Whisper Large-v3 Release

github.com

Even_Adder@lemmy.dbzer0.com to Opensource@kbin.social · English · 1 year ago

`large-v3` release · openai/whisper · Discussion #1762

We're pleased to announce the latest iteration of Whisper, called large-v3. Whisper-v3 has the same architecture as the previous large models except the following minor differences: The input uses ...

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.
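As a rough illustration of the three tasks mentioned above, here is a minimal sketch that assumes the `openai-whisper` Python package is installed and that the `large-v3` weights can be downloaded. The function name `run_whisper_tasks` and the audio path are made up for this example; only `whisper.load_model` and `model.transcribe` come from the package.

```python
def run_whisper_tasks(audio_path: str, model_name: str = "large-v3") -> dict:
    """Run transcription, language identification, and translation on one file.

    Hypothetical helper for illustration; not part of the whisper package.
    """
    # Imported lazily so this module still loads where openai-whisper is absent.
    import whisper

    model = whisper.load_model(model_name)

    # Speech recognition in the spoken language. When no language is passed,
    # transcribe() detects it and reports it under the "language" key.
    transcription = model.transcribe(audio_path)

    # Speech translation: same audio, decoded into English text.
    translation = model.transcribe(audio_path, task="translate")

    return {
        "language": transcription["language"],
        "transcript": transcription["text"],
        "english": translation["text"],
    }


if __name__ == "__main__":
    # "sample.mp3" is a placeholder path, not a file referenced by the post.
    print(run_whisper_tasks("sample.mp3"))
```

Loading the model once and reusing it for both calls avoids reading the large checkpoint twice.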

The large-v3 model shows improved performance across a wide variety of languages. The plot in the linked announcement (not reproduced here) includes all languages where Whisper large-v3 achieves an error rate below 60% on Common Voice 15 and Fleurs, showing a 10% to 20% reduction in errors compared to large-v2.
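To make the error-rate figures concrete, here is a small sketch of how such numbers are commonly computed. It assumes the metric is word error rate (WER) and that the quoted reduction is relative; the post states neither, and the sample sentences and rates are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(old_rate: float, new_rate: float) -> float:
    """Fraction by which the error rate dropped, relative to the old rate."""
    return (old_rate - new_rate) / old_rate


if __name__ == "__main__":
    # Invented example: one dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
    # Invented rates: going from 20% to 16% is a 20% relative reduction.
    print(relative_reduction(0.20, 0.16))
```

Under the relative reading, a language at 20% error with large-v2 would land between 16% and 18% with large-v3.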
