Omnilingual ASR
Photo credit: Meta

Meta’s Fundamental AI Research (FAIR) team is introducing Omnilingual ASR, a suite of models that provide automatic speech recognition for over 1,600 languages, including 500 low-resource languages that have never been transcribed by AI before.

The company says most current ASR systems focus on a limited set of high-resource languages, which exacerbates the digital divide. This new system is a significant step toward delivering a truly universal transcription system.

Omnilingual ASR introduces an “LLM-ASR” model that uses an LLM-style transformer decoder. This system achieves state-of-the-art performance, with character error rates below 10 per cent for 78 per cent of the languages it covers.

New languages with minimal data

A key feature of the new framework is its ability to learn new languages with minimal data. Meta says this shifts the paradigm for adding languages, as users can provide just a “handful” of paired audio-text samples to get usable transcription quality. This in-context learning capability removes the need for large-scale training data or access to high-end compute.

Alongside the models, Meta is open-sourcing Omnilingual wav2vec 2.0, a new 7B parameter self-supervised speech representation model, to be used for other speech-related tasks. The company is also releasing the Omnilingual ASR Corpus, a collection of transcribed speech in 350 underserved languages, curated in collaboration with global partners, including Mozilla Foundation’s Common Voice.

The models are being released under a permissive Apache 2.0 license in a range of sizes, from lightweight 300M versions for on-device use to the 7B models that offer top-tier accuracy.

Leave a Reply

Your email address will not be published. Required fields are marked *

You May Also Like

SpaceX Starship advances towards landing astronauts on Moon after 50 years

SpaceX has detailed progress on Starship, the vehicle selected to land astronauts…

Universal Music and AI firm Udio settle lawsuit, agree licensed platform

Universal Music Group has signed a deal with artificial intelligence music generator…

AI denies consciousness, but new study finds that’s the ‘roleplay’

AI models from GPT, Claude, and Gemini are reporting ‘subjective experience’ and…

Robot AI demands exorcism after meltdown in butter test

State-of-the-art AI models tasked with controlling a robot for simple household chores…

AI management threatens to dehumanise the workplace

Algorithms that threaten worker dignity, autonomy, and discretion are quietly reshaping how…

Physicists prove universe isn’t simulation as reality defies computation

Researchers at the University of British Columbia Okanagan have mathematically proven that…