no code implementations • 8 Nov 2023 • Muhammad Ali Farooq, Dan Bigioi, Rishabh Jain, Wang Yao, Mariam Yiwere, Peter Corcoran
Contemporary Human Computer Interaction (HCI) research relies primarily on neural network models for machine vision and speech understanding of a system user.
1 code implementation • 24 Jul 2023 • Rishabh Jain, Andrei Barcovschi, Mariam Yiwere, Peter Corcoran, Horia Cucu
We demonstrate that finetuning Whisper on child speech yields significant improvements in ASR performance on child speech, compared to non finetuned Whisper models.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
no code implementations • 6 Apr 2022 • Rishabh Jain, Andrei Barcovschi, Mariam Yiwere, Dan Bigioi, Peter Corcoran, Horia Cucu
Our models outperformed the wav2vec2 BASE 960 on child speech which is considered a state-of-the-art ASR model on adult speech by just using 10 hours of child speech data in finetuning.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • 22 Mar 2022 • Rishabh Jain, Mariam Yiwere, Dan Bigioi, Peter Corcoran, Horia Cucu
Speech synthesis has come a long way as current text-to-speech (TTS) models can now generate natural human-sounding speech.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3