Search Results for author: Xin Lei

Found 10 papers, 3 papers with code

FADI-AEC: Fast Score Based Diffusion Model Guided by Far-end Signal for Acoustic Echo Cancellation

no code implementations • 8 Jan 2024 • Yang Liu, Li Wan, Yun Li, Yiteng Huang, Ming Sun, James Luan, Yangyang Shi, Xin Lei

Despite the potential of diffusion models in speech enhancement, their deployment in Acoustic Echo Cancellation (AEC) has been restricted.

Acoustic echo cancellation Speech Enhancement

Paper
Add Code

TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models

no code implementations • 5 Sep 2023 • Yuan Shangguan, Haichuan Yang, Danni Li, Chunyang Wu, Yassir Fathullah, Dilin Wang, Ayushi Dalmia, Raghuraman Krishnamoorthi, Ozlem Kalinli, Junteng Jia, Jay Mahadeokar, Xin Lei, Mike Seltzer, Vikas Chandra

Results demonstrate that our TODM Supernet either matches or surpasses the performance of manually tuned models by up to a relative of 3% better in word error rate (WER), while efficiently keeping the cost of training many models at a small constant.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

LiCo-Net: Linearized Convolution Network for Hardware-efficient Keyword Spotting

no code implementations • 9 Nov 2022 • Haichuan Yang, Zhaojun Yang, Li Wan, Biqiao Zhang, Yangyang Shi, Yiteng Huang, Ivaylo Enchev, Limin Tang, Raziel Alvarez, Ming Sun, Xin Lei, Raghuraman Krishnamoorthi, Vikas Chandra

This paper proposes a hardware-efficient architecture, Linearized Convolution Network (LiCo-Net) for keyword spotting.

Keyword Spotting

Paper
Add Code

SCA: Streaming Cross-attention Alignment for Echo Cancellation

no code implementations • 1 Nov 2022 • Yang Liu, Yangyang Shi, Yun Li, Kaustubh Kalgaonkar, Sriram Srinivasan, Xin Lei

End-to-End deep learning has shown promising results for speech enhancement tasks, such as noise suppression, dereverberation, and speech separation.

Speech Enhancement Speech Separation

Paper
Add Code

U2++: Unified Two-pass Bidirectional End-to-end Model for Speech Recognition

no code implementations • 10 Jun 2021 • Di wu, BinBin Zhang, Chao Yang, Zhendong Peng, Wenjing Xia, Xiaoyu Chen, Xin Lei

On the experiment of AISHELL-1, we achieve a 4. 63\% character error rate (CER) with a non-streaming setup and 5. 05\% with a streaming setup with 320ms latency by U2++.

Data Augmentation speech-recognition +1

Paper
Add Code

WeNet: Production oriented Streaming and Non-streaming End-to-End Speech Recognition Toolkit

4 code implementations • 2 Feb 2021 • Zhuoyuan Yao, Di wu, Xiong Wang, BinBin Zhang, Fan Yu, Chao Yang, Zhendong Peng, Xiaoyu Chen, Lei Xie, Xin Lei

In this paper, we propose an open source, production first, and production ready speech recognition toolkit called WeNet in which a new two-pass approach is implemented to unify streaming and non-streaming end-to-end (E2E) speech recognition in a single model.

Decoder speech-recognition +1

10,219

Paper
Code

Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition

5 code implementations • 10 Dec 2020 • BinBin Zhang, Di wu, Zhuoyuan Yao, Xiong Wang, Fan Yu, Chao Yang, Liyong Guo, Yaguang Hu, Lei Xie, Xin Lei

In this paper, we present a novel two-pass approach to unify streaming and non-streaming end-to-end (E2E) speech recognition in a single model.

Ranked #6 on Speech Recognition on AISHELL-1

Decoder Sentence +2

10,219

Paper
Code

Knowledge Distillation For Recurrent Neural Network Language Modeling With Trust Regularization

no code implementations • 8 Apr 2019 • Yangyang Shi, Mei-Yuh Hwang, Xin Lei, Haoyu Sheng

Using knowledge distillation with trust regularization, we reduce the parameter size to a third of that of the previously published best model while maintaining the state-of-the-art perplexity result on Penn Treebank data.

Knowledge Distillation Language Modelling +2

Paper
Add Code

Direct Object Recognition Without Line-of-Sight Using Optical Coherence

no code implementations • CVPR 2019 • Xin Lei, Liangyu He, Yixuan Tan, Ken Xingze Wang, Xinggang Wang, Yihan Du, Shanhui Fan, Zongfu Yu

Visual object recognition under situations in which the direct line-of-sight is blocked, such as when it is occluded around the corner, is of practical importance in a wide range of applications.

Object Object Recognition

Paper
Add Code

End-To-End Speech Recognition Using A High Rank LSTM-CTC Based Model

1 code implementation • 12 Mar 2019 • Yangyang Shi, Mei-Yuh Hwang, Xin Lei

In this paper, we propose to use a high rank projection layer to replace the projection matrix.

Data Augmentation speech-recognition +1

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.