no code implementations • 24 Jan 2024 • Rhiannon Mogridge, George Close, Robert Sutherland, Thomas Hain, Jon Barker, Stefan Goetze, Anton Ragni
Neural networks have been successfully used for non-intrusive speech intelligibility prediction.
1 code implementation • 9 Oct 2023 • William Ravenscroft, Stefan Goetze, Thomas Hain
Convolution augmented transformers (conformers) have performed well for many speech processing tasks but have been under-researched for speech separation.
Ranked #3 on Speech Separation on WHAMR!
1 code implementation • 27 Jul 2023 • George Close, Thomas Hain, Stefan Goetze
In this work, speech enhancement (SE) models are trained and tested on a number of different languages, using self-supervised representations, themselves trained on different language combinations and with differing network structures, as loss-function representations.
no code implementations • 25 Jul 2023 • George Close, Thomas Hain, Stefan Goetze
Self-supervised speech representations (SSSRs) have been successfully applied to a number of speech-processing tasks, e.g. as feature extractors for speech quality (SQ) prediction, which is in turn relevant for the assessment and training of speech enhancement systems for users with normal or impaired hearing.
1 code implementation • 10 May 2023 • Hongbo Zhang, Chen Tang, Tyler Loakman, Chenghua Lin, Stefan Goetze
In this paper, we propose a novel context-aware graph-attention model (Context-aware GAT), which can effectively incorporate global features of relevant knowledge graphs based on a context-enhanced knowledge aggregation process.
no code implementations • 14 Apr 2023 • William Ravenscroft, Stefan Goetze, Thomas Hain
In this work, the impact of applying training signal length (TSL) limits is analysed for two speech separation models: SepFormer, a transformer model, and Conv-TasNet, a convolutional model.
no code implementations • 11 Jan 2023 • George Close, William Ravenscroft, Thomas Hain, Stefan Goetze
Recent work in the domain of speech enhancement has explored the use of self-supervised speech representations to aid in the training of neural speech enhancement models.
2 code implementations • 27 Oct 2022 • William Ravenscroft, Stefan Goetze, Thomas Hain
In this work, deformable convolution is proposed as a solution to allow TCN models to have dynamic receptive fields (RFs) that can adapt to various reverberation times for reverberant speech separation.
Ranked #12 on Speech Separation on WHAMR!
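The deformable-convolution idea above can be illustrated with a minimal single-channel sketch: each kernel tap is shifted by a (possibly fractional) offset and sampled with linear interpolation, so the effective receptive field can stretch or shrink. This is a toy illustration of the general mechanism, not the paper's implementation; the function names and the causal, zero-padded formulation are assumptions for this example.

```python
import math

def dilated_conv1d(x, w, dilation):
    """Causal dilated 1-D convolution with fixed tap spacing (baseline)."""
    out = []
    for t in range(len(x)):
        s = 0.0
        for k, wk in enumerate(w):
            idx = t - k * dilation
            if 0 <= idx < len(x):
                s += wk * x[idx]
        out.append(s)
    return out

def deformable_conv1d(x, w, dilation, offsets):
    """Deformable variant: tap k is additionally shifted by offsets[k]
    (in frames, possibly fractional) and sampled by linear interpolation,
    making the receptive field data-adaptable rather than fixed."""
    out = []
    for t in range(len(x)):
        s = 0.0
        for k, wk in enumerate(w):
            pos = t - k * dilation + offsets[k]
            lo = math.floor(pos)
            frac = pos - lo
            # linear interpolation between the two nearest integer frames
            for idx, weight in ((lo, 1.0 - frac), (lo + 1, frac)):
                if 0 <= idx < len(x):
                    s += wk * x[idx] * weight
        out.append(s)
    return out
```

With all offsets set to zero the deformable version reduces exactly to the standard dilated convolution, which is a useful sanity check; in a real model the offsets would be predicted per position by a small learned network.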
1 code implementation • 17 May 2022 • William Ravenscroft, Stefan Goetze, Thomas Hain
It is shown that this weighted multi-dilation temporal convolutional network (WD-TCN) consistently outperforms the TCN across various model configurations, and that using the WD-TCN is a more parameter-efficient way to improve performance than increasing the number of convolutional blocks.
Ranked #1 on Speech Dereverberation on WHAMR!
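The weighting idea behind the WD-TCN can be sketched in miniature: several parallel dilated-convolution branches are combined with softmax attention weights. This single-channel sketch only illustrates the general mechanism under assumed simplifications (scalar per-branch logits, causal zero-padded convolution); it is not the authors' architecture.

```python
import math

def dilated_conv1d(x, w, dilation):
    """Causal dilated 1-D convolution on a single channel."""
    out = []
    for t in range(len(x)):
        s = 0.0
        for k, wk in enumerate(w):
            idx = t - k * dilation
            if 0 <= idx < len(x):
                s += wk * x[idx]
        out.append(s)
    return out

def weighted_multi_dilation(x, kernels, dilations, logits):
    """Run one dilated-conv branch per dilation and mix the branch outputs
    with softmax weights derived from the logits (in a trained model these
    weights would be produced by a learned attention module)."""
    exps = [math.exp(l) for l in logits]
    z = sum(exps)
    weights = [e / z for e in exps]
    branches = [dilated_conv1d(x, w, d) for w, d in zip(kernels, dilations)]
    return [sum(wt * b[t] for wt, b in zip(weights, branches))
            for t in range(len(x))]
```

With equal logits every branch contributes equally; as one logit grows, the layer smoothly selects that branch's dilation, which is how the network can trade off receptive-field sizes without adding blocks.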
1 code implementation • 13 Apr 2022 • William Ravenscroft, Stefan Goetze, Thomas Hain
A feature of TCNs is that their receptive field (RF) depends on the specific model configuration, which determines how many input frames are observed to produce an individual output frame.
Ranked #1 on Speech Dereverberation on WHAMR_ext
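The dependence of the receptive field on the model configuration can be made concrete for the common case where the dilation doubles with each block inside a repeated stack (as in Conv-TasNet-style TCNs). The helper below is a sketch under that assumption; the function name and the doubling schedule are choices for this example, not taken from the paper.

```python
def tcn_receptive_field(kernel_size, n_blocks, n_repeats):
    """Receptive field (in frames) of a TCN stack whose dilation doubles
    each block within a repeat (1, 2, 4, ...), repeated n_repeats times.
    Each dilated conv with kernel K and dilation d adds (K - 1) * d frames."""
    dilations = [2 ** b for b in range(n_blocks)] * n_repeats
    return 1 + sum((kernel_size - 1) * d for d in dilations)
```

For example, a stack with kernel size 3, 8 blocks per repeat, and 3 repeats can see 1531 input frames per output frame, which shows how quickly the RF grows with depth under exponential dilation.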
no code implementations • 23 Mar 2022 • George Close, Thomas Hain, Stefan Goetze
Training of speech enhancement systems often does not incorporate knowledge of human perception and thus can lead to unnatural sounding results.