The Babel Program is developing agile and robust speech recognition technology that can be rapidly applied to any human language, so that analysts can search and efficiently process massive amounts of real-world recorded speech. Today’s transcription systems are built on technology originally developed for English and perform markedly worse on other languages. They have often taken years to develop and cover only a small subset of the world's languages. Babel intends to demonstrate that a speech transcription system for any new language can be generated within one week, with keyword search performance sufficient for effective triage of massive amounts of speech recorded in challenging real-world situations.
The goal of the Babel Program is to develop methods for building speech recognition technology for a much larger set of languages than has been addressed to date. The Program requires innovations in rapidly modeling a novel language from significantly less training data, which is also much noisier and more heterogeneous than the data used in current state-of-the-art systems. Babel's technical measures of success focus on how well the generated model supports effective word-based search of noisy channel speech in the languages under investigation. The new methods are being systematized so that they can be applied rapidly to a novel underserved language.
Performers (Prime Contractors)
Carnegie Mellon University; IBM - T.J. Watson Research Center; Raytheon BBN Technologies; University of California, Berkeley - International Computer Science Institute
- Multilingual, multidialectal speech recognition
- Keyword search algorithms
- Speech recognition in noisy environments
- Low resource languages
- Rapid adaptation to new languages and new environments
- Machine learning
Related Publications and Websites
To access Babel program-related publications, please enter the following into a Google Scholar search query: "W911NF-12-C-0012 OR W911NF-12-C-0013 OR W911NF-12-C-0014 OR W911NF-12-C-0015"
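The query above can also be assembled programmatically. The following is a minimal sketch, not an official tool: it joins the four contract numbers with OR and URL-encodes the result. The function names are hypothetical, and the endpoint `https://scholar.google.com/scholar` with its `q` parameter is an assumption about Google Scholar's public search URL, not something stated on this page.

```python
from urllib.parse import urlencode

# Contract numbers exactly as quoted in the search query above.
CONTRACT_NUMBERS = [
    "W911NF-12-C-0012",
    "W911NF-12-C-0013",
    "W911NF-12-C-0014",
    "W911NF-12-C-0015",
]

# Assumption: Google Scholar's public search endpoint, queried via "q".
SCHOLAR_SEARCH_URL = "https://scholar.google.com/scholar"


def build_babel_query(contract_numbers=CONTRACT_NUMBERS):
    """Join the contract numbers into a single OR query string."""
    return " OR ".join(contract_numbers)


def build_scholar_url(query):
    """Return a search URL with the query percent-encoded."""
    return f"{SCHOLAR_SEARCH_URL}?{urlencode({'q': query})}"


if __name__ == "__main__":
    # Print the URL; paste it into a browser to run the search.
    print(build_scholar_url(build_babel_query()))
```

The script only builds the URL and does not fetch results; opening the printed link in a browser is equivalent to typing the query by hand.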
See also the OpenKWS website.
- What Happens When Spies Can Eavesdrop on Any Conversation?
- ASpIRE – IARPA Automatic Speech Recognition in Reverberant Environments Challenge
- Intelligence experts ask for speech-recognition software that works in noisy, echo-ridden rooms
- Mary Harper: IARPA Wants Speech Recognition Tech for Reverberant Environments
- US intelligence unit launches $50k speech recognition competition
- Intelligence community seeks advanced speech-to-text technology, launches contest
- IARPA Issues $50K Automatic Speech Recognition Software Challenge
- ICSI Seeks To Unravel ASR Limitations