no code implementations • 8 Jan 2022 • Nauman Dawalatabad, Tushar Vatsal, Ashutosh Gupta, Sungsoo Kim, Shatrughan Singh, Dhananjaya Gowda, Chanwoo Kim
With the use of popular transducer-based models, it has become possible to practically deploy streaming speech recognition models on small devices [1].
no code implementations • 14 Dec 2020 • Chanwoo Kim, Dhananjaya Gowda, Dongsoo Lee, Jiyeon Kim, Ankur Kumar, Sungsoo Kim, Abhinav Garg, Changwoo Han
Conventional speech recognition systems comprise a large number of discrete components such as an acoustic model, a language model, a pronunciation model, a text-normalizer, an inverse-text normalizer, a decoder based on a Weighted Finite State Transducer (WFST), and so on.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +4
1 code implementation • 23 Jul 2020 • Kyungmin Lee, Hyunwhan Joe, Hyeontaek Lim, Kwangyoun Kim, Sungsoo Kim, Chang Woo Han, Hong-Gee Kim
Input sequences are capsulized then sliced by a window size.
no code implementations • 2 Jan 2020 • Kwangyoun Kim, Kyungmin Lee, Dhananjaya Gowda, Junmo Park, Sungsoo Kim, Sichen Jin, Young-Yoon Lee, Jinsu Yeo, Daehyun Kim, Seokyeong Jung, Jungin Lee, Myoungji Han, Chanwoo Kim
In this paper, we present a new on-device automatic speech recognition (ASR) system based on monotonic chunk-wise attention (MoChA) models trained with large (> 10K hours) corpus.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
no code implementations • 22 Dec 2019 • Chanwoo Kim, Sungsoo Kim, Kwangyoun Kim, Mehul Kumar, Jiyeon Kim, Kyungmin Lee, Changwoo Han, Abhinav Garg, Eunhyang Kim, Minkyoo Shin, Shatrughan Singh, Larry Heck, Dhananjaya Gowda
Our end-to-end speech recognition system built using this training infrastructure showed a 2. 44 % WER on test-clean of the LibriSpeech test set after applying shallow fusion with a Transformer language model (LM).
no code implementations • 26 Nov 2018 • Sungsoo Kim, Jin Soo Park, Christos G. Bampis, Jaeseong Lee, Mia K. Markey, Alexandros G. Dimakis, Alan C. Bovik
We propose a video compression framework using conditional Generative Adversarial Networks (GANs).