Search Results for author: Masato Mimura

Found 15 papers, 6 papers with code

Non-autoregressive Error Correction for CTC-based ASR with Phone-conditioned Masked LM

1 code implementation • 8 Sep 2022 • Hayato Futami, Hirofumi Inaguma, Sei Ueno, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara

Connectionist temporal classification (CTC)-based models are attractive in automatic speech recognition (ASR) because of their non-autoregressive nature.

Automatic Speech Recognition (ASR) +3

Distilling the Knowledge of BERT for CTC-based ASR

no code implementations • 5 Sep 2022 • Hayato Futami, Hirofumi Inaguma, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara

In this study, we propose to distill the knowledge of BERT for CTC-based ASR, extending our previous study for attention-based ASR.

Automatic Speech Recognition (ASR) +2

ASR Rescoring and Confidence Estimation with ELECTRA

no code implementations • 5 Oct 2021 • Hayato Futami, Hirofumi Inaguma, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara

We propose an ASR rescoring method for directly detecting errors with ELECTRA, which is originally a pre-training method for NLP tasks.

Automatic Speech Recognition (ASR) +2

Commuting symplectomorphisms on a surface and the flux homomorphism

no code implementations • 24 Feb 2021 • Morimichi Kawasaki, Mitsuaki Kimura, Takahiro Matsushita, Masato Mimura

Let $(S,\omega)$ be a closed connected oriented surface whose genus $l$ is at least two equipped with a symplectic form.

Symplectic Geometry, Group Theory, Geometric Topology; MSC: Primary 20F12, 20J05, 37E35, 53D35, 70H15; Secondary 20F36, 37A15, 37J05, 37J10, 57R17, 53D22

Constellations in prime elements of number fields

no code implementations • 31 Dec 2020 • Wataru Kai, Masato Mimura, Akihiro Munemasa, Shin-ichiro Seki, Kiyoto Yoshino

Given any number field, we prove that there exist arbitrarily shaped constellations consisting of pairwise non-associate prime elements of the ring of integers.

Number Theory, Combinatorics; MSC: 11B30 (Primary); 11B25, 11H55, 11N05, 11R04, 05C55 (Secondary)

End-to-end Music-mixed Speech Recognition

1 code implementation • 27 Aug 2020 • Jeongwoo Woo, Masato Mimura, Kazuyoshi Yoshii, Tatsuya Kawahara

The time-domain separation method outperformed a frequency-domain separation method, which reuses the phase information of the input mixture signal, both in simple cascading and joint training settings.

Audio and Speech Processing

Distilling the Knowledge of BERT for Sequence-to-Sequence ASR

1 code implementation • 9 Aug 2020 • Hayato Futami, Hirofumi Inaguma, Sei Ueno, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara

Experimental evaluations show that our method significantly improves the ASR performance from the seq2seq baseline on the Corpus of Spontaneous Japanese (CSJ).

Automatic Speech Recognition (ASR) +3

Bavard's duality theorem for mixed commutator length

no code implementations • 5 Jul 2020 • Morimichi Kawasaki, Mitsuaki Kimura, Takahiro Matsushita, Masato Mimura

The goal in this paper is to establish Bavard's duality theorem of $G$-invariant quasimorphisms, which was previously proved by Kawasaki and Kimura in the case $N = [G, N]$.

Group Theory, Algebraic Topology, Geometric Topology

Enhancing Monotonic Multihead Attention for Streaming ASR

1 code implementation • 19 May 2020 • Hirofumi Inaguma, Masato Mimura, Tatsuya Kawahara

For streaming inference, all monotonic attention (MA) heads should learn proper alignments because the next token is not generated until all heads detect the corresponding token boundaries.

Automatic Speech Recognition (ASR) +2

Generative Adversarial Training Data Adaptation for Very Low-resource Automatic Speech Recognition

1 code implementation • 19 May 2020 • Kohei Matsuura, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara

We evaluated this speaker adaptation approach on two low-resource corpora, namely, Ainu and Mboshi.

Automatic Speech Recognition (ASR) +3

CTC-synchronous Training for Monotonic Attention Model

1 code implementation • 10 May 2020 • Hirofumi Inaguma, Masato Mimura, Tatsuya Kawahara

Monotonic chunkwise attention (MoChA) has been studied for the online streaming automatic speech recognition (ASR) based on a sequence-to-sequence framework.

Automatic Speech Recognition (ASR) +2

Speech Corpus of Ainu Folklore and End-to-end Speech Recognition for Ainu Language

no code implementations • LREC 2020 • Kohei Matsuura, Sei Ueno, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara

Ainu is an unwritten language that has been spoken by Ainu people who are one of the ethnic groups in Japan.

Automatic Speech Recognition (ASR) +2

Improving OOV Detection and Resolution with External Language Models in Acoustic-to-Word ASR

no code implementations • 22 Sep 2019 • Hirofumi Inaguma, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara

Moreover, the A2C model can be used to recover out-of-vocabulary (OOV) words that are not covered by the A2W model, but this requires accurate detection of OOV words.

Automatic Speech Recognition (ASR) +1

Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition

no code implementations • 22 Mar 2019 • Kazuki Shimada, Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara

To solve this problem, we take an unsupervised approach that decomposes each TF bin into the sum of speech and noise by using multichannel nonnegative matrix factorization (MNMF).

Automatic Speech Recognition (ASR) +2

Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization

no code implementations • 31 Oct 2017 • Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara

This paper presents a statistical method of single-channel speech enhancement that uses a variational autoencoder (VAE) as a prior distribution on clean speech.

Speech Enhancement
