Search Results for author: Oleg Rybakov

Found 12 papers, 4 papers with code

USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models

no code implementations • 13 Dec 2023 • Shaojin Ding, David Qiu, David Rim, Yanzhang He, Oleg Rybakov, Bo Li, Rohit Prabhavalkar, Weiran Wang, Tara N. Sainath, Zhonglin Han, Jian Li, Amir Yazdanbakhsh, Shivani Agrawal

We conducted extensive experiments with a 2-billion parameter USM on a large-scale voice search dataset to evaluate our proposed method.

Automatic Speech Recognition (ASR) +3

2-bit Conformer quantization for automatic speech recognition

no code implementations • 26 May 2023 • Oleg Rybakov, Phoenix Meadowlark, Shaojin Ding, David Qiu, Jian Li, David Rim, Yanzhang He

With the large-scale training data, we obtain a 2-bit Conformer model with over 40% model size reduction against the 4-bit version at the cost of 17% relative word error rate degradation.

Automatic Speech Recognition (ASR) +3

RAND: Robustness Aware Norm Decay For Quantized Seq2seq Models

no code implementations • 24 May 2023 • David Qiu, David Rim, Shaojin Ding, Oleg Rybakov, Yanzhang He

With the rapid increase in the size of neural networks, model compression has become an important area of research.

Machine Translation, Model Compression +3

STEP: Learning N:M Structured Sparsity Masks from Scratch with Precondition

no code implementations • 2 Feb 2023 • Yucheng Lu, Shivani Agrawal, Suvinay Subramanian, Oleg Rybakov, Christopher De Sa, Amir Yazdanbakhsh

Recent innovations on hardware (e.g., Nvidia A100) have motivated learning N:M structured sparsity masks from scratch for fast model inference.

Machine Translation

Streaming Parrotron for on-device speech-to-speech conversion

no code implementations • 25 Oct 2022 • Oleg Rybakov, Fadi Biadsy, Xia Zhang, Liyang Jiang, Phoenix Meadowlark, Shivani Agrawal

We present a streaming-based approach to produce an acceptable delay, with minimal loss in speech conversion quality, when compared to a reference state-of-the-art non-streaming approach.

Decoder, Quantization +1

4-bit Conformer with Native Quantization Aware Training for Speech Recognition

1 code implementation • 29 Mar 2022 • Shaojin Ding, Phoenix Meadowlark, Yanzhang He, Lukasz Lew, Shivani Agrawal, Oleg Rybakov

Reducing the latency and model size has always been a significant research problem for live Automatic Speech Recognition (ASR) application scenarios.

Automatic Speech Recognition (ASR) +2

A Scalable Model Specialization Framework for Training and Inference using Submodels and its Application to Speech Model Personalization

no code implementations • 23 Mar 2022 • Fadi Biadsy, Youzheng Chen, Xia Zhang, Oleg Rybakov, Andrew Rosenberg, Pedro J. Moreno

We also show that learning a speaker-embedding space can scale further and reduce the amount of personalization training data required per speaker.

Real time spectrogram inversion on mobile phone

1 code implementation • 1 Mar 2022 • Oleg Rybakov, Marco Tagliasacchi, Yunpeng Li, Liyang Jiang, Xia Zhang, Fadi Biadsy

We present two methods of real-time magnitude spectrogram inversion: streaming Griffin-Lim (GL) and streaming MelGAN.

Pareto-Optimal Quantized ResNet Is Mostly 4-bit

4 code implementations • 7 May 2021 • Amirali Abdolrashidi, Lisa Wang, Shivani Agrawal, Jonathan Malmaud, Oleg Rybakov, Chas Leichner, Lukasz Lew

In this work, we use ResNet as a case study to systematically investigate the effects of quantization on inference compute cost-quality tradeoff curves.

Quantization

Streaming keyword spotting on mobile devices

3 code implementations • 14 May 2020 • Oleg Rybakov, Natasha Kononenko, Niranjan Subrahmanya, Mirko Visontai, Stella Laurenzo

In this work we explore the latency and accuracy of keyword spotting (KWS) models in streaming and non-streaming modes on mobile phones.

Ranked #10 on Keyword Spotting on Google Speech Commands

Audio and Speech Processing, Sound

Low-bit quantization and quantization-aware training for small-footprint keyword spotting

no code implementations • 19 Oct 2018 • Yuriy Mishchenko, Yusuf Goren, Ming Sun, Chris Beauchene, Spyros Matsoukas, Oleg Rybakov, Shiv Naga Prasad Vitaladevuni

We investigate low-bit quantization to reduce the computational cost of deep neural network (DNN) based keyword spotting (KWS).

Quantization, Small-Footprint Keyword Spotting

The Effectiveness of a Two-Layer Neural Network for Recommendations

no code implementations • ICLR 2018 • Oleg Rybakov, Vijai Mohan, Avishkar Misra, Scott LeGrand, Rejith Joseph, Kiuk Chung, Siddharth Singh, Qian You, Eric Nalisnick, Leo Dirac, Runfei Luo

We present a personalized recommender system that uses a neural network to recommend products such as eBooks, audiobooks, mobile apps, video, and music.

Recommendation Systems, Vocal Bursts Valence Prediction
