Search Results for author: Venkatesh Ravichandran

Found 11 papers, 1 paper with code

Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition

no code implementations • 28 Mar 2024 • Yash Jain, David Chan, Pranav Dheram, Aparna Khare, Olabanji Shonibare, Venkatesh Ravichandran, Shalini Ghosh

Recent advances in machine learning have demonstrated that multi-modal pre-training can improve automatic speech recognition (ASR) performance compared to randomly initialized models, even when models are fine-tuned on uni-modal tasks.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Turn-taking and Backchannel Prediction with Acoustic and Large Language Model Fusion

no code implementations • 26 Jan 2024 • Jinhan Wang, Long Chen, Aparna Khare, Anirudh Raju, Pranav Dheram, Di He, Minhua Wu, Andreas Stolcke, Venkatesh Ravichandran

We propose an approach for continuous prediction of turn-taking and backchanneling locations in spoken dialogue by fusing a neural acoustic model with a large language model (LLM).

Language Modelling Large Language Model

Two-pass Endpoint Detection for Speech Recognition

no code implementations • 17 Jan 2024 • Anirudh Raju, Aparna Khare, Di He, Ilya Sklyar, Long Chen, Sam Alptekin, Viet Anh Trinh, Zhe Zhang, Colin Vaz, Venkatesh Ravichandran, Roland Maas, Ariya Rastrow

Endpoint (EP) detection is a key component of far-field speech recognition systems that assist the user through voice commands.

speech-recognition Speech Recognition

Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification

no code implementations • 22 Dec 2023 • Anirudh S. Sundar, Chao-Han Huck Yang, David M. Chan, Shalini Ghosh, Venkatesh Ravichandran, Phani Sankar Nidadavolu

In cases where some data/compute is available, we present Learnable-MAM, a data-driven approach to merging attention matrices, resulting in a further 2.90% relative reduction in WER for ASR and 18.42% relative reduction in AEC compared to fine-tuning.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Improving fairness for spoken language understanding in atypical speech with Text-to-Speech

1 code implementation • 16 Nov 2023 • Helin Wang, Venkatesh Ravichandran, Milind Rao, Becky Lammers, Myra Sydnor, Nicholas Maragakis, Ankur A. Butala, Jayne Zhang, Lora Clawson, Victoria Chovaz, Laureano Moro-Velazquez

Spoken language understanding (SLU) systems often exhibit suboptimal performance in processing atypical speech, typically caused by neurological conditions and motor impairments.

Data Augmentation Fairness +2

Cross-utterance ASR Rescoring with Graph-based Label Propagation

no code implementations • 27 Mar 2023 • Srinath Tankasala, Long Chen, Andreas Stolcke, Anirudh Raju, Qianli Deng, Chander Chandak, Aparna Khare, Roland Maas, Venkatesh Ravichandran

We propose a novel approach for ASR N-best hypothesis rescoring with graph-based label propagation by leveraging cross-utterance acoustic similarity.

Fairness Language Modelling

Adaptive Endpointing with Deep Contextual Multi-armed Bandits

no code implementations • 23 Mar 2023 • Do June Min, Andreas Stolcke, Anirudh Raju, Colin Vaz, Di He, Venkatesh Ravichandran, Viet Anh Trinh

In this paper, we aim to provide a solution for adaptive endpointing by proposing an efficient method for choosing an optimal endpointing configuration given utterance-level audio features in an online setting, while avoiding hyperparameter grid-search.

Multi-Armed Bandits

Stutter-TTS: Controlled Synthesis and Improved Recognition of Stuttered Speech

no code implementations • 4 Nov 2022 • Xin Zhang, Iván Vallés-Pérez, Andreas Stolcke, Chengzhu Yu, Jasha Droppo, Olabanji Shonibare, Roberto Barra-Chicote, Venkatesh Ravichandran

By fine-tuning an ASR model on synthetic stuttered speech, we are able to reduce word error by 5.7% relative on stuttered utterances, with only minor (<0.2% relative) degradation for fluent utterances.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Graph-based Multi-View Fusion and Local Adaptation: Mitigating Within-Household Confusability for Speaker Identification

no code implementations • 8 Jul 2022 • Long Chen, Yixiong Meng, Venkatesh Ravichandran, Andreas Stolcke

Speaker identification (SID) in the household scenario (e.g., for smart speakers) is an important but challenging problem due to the limited number of labeled (enrollment) utterances, confusable voices, and demographic imbalances.

Fairness Speaker Identification +1

Enhancing ASR for Stuttered Speech with Limited Data Using Detect and Pass

no code implementations • 8 Feb 2022 • Olabanji Shonibare, Xiaosu Tong, Venkatesh Ravichandran

We propose a simple but effective method called 'Detect and Pass' to make modern ASR systems accessible for People Who Stutter in a limited data setting.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Graph-based Label Propagation for Semi-Supervised Speaker Identification

no code implementations • 15 Jun 2021 • Long Chen, Venkatesh Ravichandran, Andreas Stolcke

We show in experiments on the VoxCeleb dataset that this approach makes effective use of unlabeled data and improves speaker identification accuracy compared to two state-of-the-art scoring methods as well as their semi-supervised variants based on pseudo-labels.

Speaker Identification Speaker Recognition
