1 code implementation • 5 May 2022 • Neeraj Chhimwal, Anirudh Gupta, Rishabh Gaur, Harveen Singh Chadha, Priyanshi Shah, Ankur Dhuriya, Vivek Raghavan
To understand and evaluate the accuracy of our proposed pipeline, we introduce two metrics: Cluster Purity, and Cluster Uniqueness.
Automatic Speech Recognition (ASR) +2
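The abstract above names Cluster Purity without defining it in this excerpt. As a rough illustration only, the sketch below computes the standard textbook notion of purity (the fraction of items that belong to the majority label of their cluster). Whether the paper uses exactly this definition is an assumption, and Cluster Uniqueness is not sketched because its definition is not given here. The function name `cluster_purity` and its arguments are hypothetical.

```python
from collections import Counter, defaultdict


def cluster_purity(cluster_ids, true_labels):
    """Textbook purity: share of items matching their cluster's majority label.

    Assumption: this is the conventional definition, not necessarily the
    one used in the paper listed above.
    """
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one item is required")

    # Count how often each true label occurs inside each cluster.
    label_counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        label_counts[cluster][label] += 1

    # Sum the size of the majority label in every cluster.
    majority_total = sum(max(counts.values()) for counts in label_counts.values())
    return majority_total / len(cluster_ids)


if __name__ == "__main__":
    # Two clusters over five utterances labelled by (hypothetical) speaker.
    print(cluster_purity([0, 0, 0, 1, 1], ["a", "a", "b", "b", "b"]))
```

A purity of 1.0 means every cluster contains a single label; note that purity alone rewards over-splitting, which is presumably why a second metric is paired with it.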
1 code implementation • 31 Mar 2022 • Anirudh Gupta, Neeraj Chhimwal, Ankur Dhuriya, Rishabh Gaur, Priyanshi Shah, Harveen Singh Chadha, Vivek Raghavan
Automatic Speech Recognition (ASR) generates text that is, most of the time, devoid of punctuation.
Automatic Speech Recognition (ASR) +6
no code implementations • 31 Mar 2022 • Anirudh Gupta, Rishabh Gaur, Ankur Dhuriya, Harveen Singh Chadha, Neeraj Chhimwal, Priyanshi Shah, Vivek Raghavan
For many low-resource languages, current approaches remain challenging because labelled data is often not available in the open domain.
Automatic Speech Recognition (ASR) +1
no code implementations • 30 Mar 2022 • Priyanshi Shah, Harveen Singh Chadha, Anirudh Gupta, Ankur Dhuriya, Neeraj Chhimwal, Rishabh Gaur, Vivek Raghavan
We implement our methodology for Hindi, one of the main Indic languages, and we believe this approach is scalable to other similar languages with a large character set.
Automatic Speech Recognition (ASR) +2
no code implementations • 30 Mar 2022 • Ankur Dhuriya, Harveen Singh Chadha, Anirudh Gupta, Priyanshi Shah, Neeraj Chhimwal, Rishabh Gaur, Vivek Raghavan
We study the effect of applying a language model (LM) on the output of Automatic Speech Recognition (ASR) systems for Indic languages.
Automatic Speech Recognition (ASR) +2
no code implementations • 30 Mar 2022 • Harveen Singh Chadha, Priyanshi Shah, Ankur Dhuriya, Neeraj Chhimwal, Anirudh Gupta, Vivek Raghavan
The decoding information from a multilingual model is used for language identification and then combined with monolingual models to get an improvement of 50% WER across languages.
Automatic Speech Recognition (ASR) +2
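Several of the abstracts in this list report results in terms of WER (word error rate). For readers unfamiliar with the metric, here is a minimal sketch of the conventional definition: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function name `word_error_rate` is hypothetical, whitespace tokenization is an assumption, and the papers may apply their own text normalization before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly,
    with no normalization of case, punctuation, or script variants.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(
                min(
                    previous[j] + 1,  # deletion of a reference word
                    current[j - 1] + 1,  # insertion of a hypothesis word
                    previous[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous = current

    return previous[-1] / len(ref_words)


if __name__ == "__main__":
    # One substitution and one deletion against six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Lower is better, and because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.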
1 code implementation • 30 Mar 2022 • Harveen Singh Chadha, Anirudh Gupta, Priyanshi Shah, Neeraj Chhimwal, Ankur Dhuriya, Rishabh Gaur, Vivek Raghavan
We present Vakyansh, an end-to-end toolkit for Speech Recognition in Indic languages.
2 code implementations • 15 Jul 2021 • Anirudh Gupta, Harveen Singh Chadha, Priyanshi Shah, Neeraj Chhimwal, Ankur Dhuriya, Rishabh Gaur, Vivek Raghavan
We present CLSRIL-23, a self-supervised learning based audio pre-trained model that learns cross-lingual speech representations from raw audio across 23 Indic languages.