no code implementations • 24 Jan 2024 • Rhiannon Mogridge, George Close, Robert Sutherland, Thomas Hain, Jon Barker, Stefan Goetze, Anton Ragni
Neural networks have been successfully used for non-intrusive speech intelligibility prediction.
1 code implementation • 9 Oct 2023 • William Ravenscroft, Stefan Goetze, Thomas Hain
Convolution augmented transformers (conformers) have performed well for many speech processing tasks but have been under-researched for speech separation.
Ranked #3 on Speech Separation on WHAMR!
1 code implementation • 27 Jul 2023 • George Close, Thomas Hain, Stefan Goetze
In this work, speech enhancement (SE) models are trained and tested on a number of different languages, using self-supervised representations, themselves trained on different language combinations and with differing network structures, as loss-function representations.
no code implementations • 25 Jul 2023 • George Close, Thomas Hain, Stefan Goetze
Self-supervised speech representations (SSSRs) have been successfully applied to a number of speech-processing tasks, e.g. as feature extractors for speech quality (SQ) prediction, which is in turn relevant for the assessment and training of speech enhancement systems for users with normal or impaired hearing.
1 code implementation • 10 May 2023 • Hongbo Zhang, Chen Tang, Tyler Loakman, Chenghua Lin, Stefan Goetze
In this paper, we propose a novel context-aware graph-attention model (Context-aware GAT), which can effectively incorporate global features of relevant knowledge graphs based on a context-enhanced knowledge aggregation process.
no code implementations • 14 Apr 2023 • William Ravenscroft, Stefan Goetze, Thomas Hain
In this work, the impact of applying training signal length (TSL) limits is analysed for two speech separation models: SepFormer, a transformer model, and Conv-TasNet, a convolutional model.
no code implementations • 11 Jan 2023 • George Close, William Ravenscroft, Thomas Hain, Stefan Goetze
Recent work in the domain of speech enhancement has explored the use of self-supervised speech representations to aid in the training of neural speech enhancement models.
2 code implementations • 27 Oct 2022 • William Ravenscroft, Stefan Goetze, Thomas Hain
In this work, deformable convolution is proposed as a solution to allow TCN models to have dynamic receptive fields (RFs) that can adapt to various reverberation times for reverberant speech separation.
Ranked #12 on Speech Separation on WHAMR!
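The deformable-convolution idea above can be illustrated with a minimal single-channel sketch: each kernel tap is shifted by a (possibly fractional) offset and sampled with linear interpolation, so the effective receptive field can stretch or shrink. This is a toy illustration of the general mechanism, not the paper's implementation; the function names and the causal, zero-padded formulation are assumptions for this example.

```python
import math

def dilated_conv1d(x, w, dilation):
    """Causal dilated 1-D convolution with fixed tap spacing (baseline)."""
    out = []
    for t in range(len(x)):
        s = 0.0
        for k, wk in enumerate(w):
            idx = t - k * dilation
            if 0 <= idx < len(x):
                s += wk * x[idx]
        out.append(s)
    return out

def deformable_conv1d(x, w, dilation, offsets):
    """Deformable variant: tap k is additionally shifted by offsets[k]
    (in frames, possibly fractional) and sampled by linear interpolation,
    making the receptive field data-adaptable rather than fixed."""
    out = []
    for t in range(len(x)):
        s = 0.0
        for k, wk in enumerate(w):
            pos = t - k * dilation + offsets[k]
            lo = math.floor(pos)
            frac = pos - lo
            # linear interpolation between the two nearest integer frames
            for idx, weight in ((lo, 1.0 - frac), (lo + 1, frac)):
                if 0 <= idx < len(x):
                    s += wk * x[idx] * weight
        out.append(s)
    return out
```

With all offsets set to zero the deformable version reduces exactly to the standard dilated convolution, which is a useful sanity check; in a real model the offsets would be predicted per position by a small learned network.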
1 code implementation • 17 May 2022 • William Ravenscroft, Stefan Goetze, Thomas Hain
It is shown that this weighted multi-dilation temporal convolutional network (WD-TCN) consistently outperforms the TCN across various model configurations, and that using the WD-TCN is a more parameter-efficient way to improve performance than increasing the number of convolutional blocks.
Ranked #1 on Speech Dereverberation on WHAMR!
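The weighting idea behind the WD-TCN can be sketched in miniature: several parallel dilated-convolution branches are combined with softmax attention weights. This single-channel sketch only illustrates the general mechanism under assumed simplifications (scalar per-branch logits, causal zero-padded convolution); it is not the authors' architecture.

```python
import math

def dilated_conv1d(x, w, dilation):
    """Causal dilated 1-D convolution on a single channel."""
    out = []
    for t in range(len(x)):
        s = 0.0
        for k, wk in enumerate(w):
            idx = t - k * dilation
            if 0 <= idx < len(x):
                s += wk * x[idx]
        out.append(s)
    return out

def weighted_multi_dilation(x, kernels, dilations, logits):
    """Run one dilated-conv branch per dilation and mix the branch outputs
    with softmax weights derived from the logits (in a trained model these
    weights would be produced by a learned attention module)."""
    exps = [math.exp(l) for l in logits]
    z = sum(exps)
    weights = [e / z for e in exps]
    branches = [dilated_conv1d(x, w, d) for w, d in zip(kernels, dilations)]
    return [sum(wt * b[t] for wt, b in zip(weights, branches))
            for t in range(len(x))]
```

With equal logits every branch contributes equally; as one logit grows, the layer smoothly selects that branch's dilation, which is how the network can trade off receptive-field sizes without adding blocks.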
1 code implementation • 13 Apr 2022 • William Ravenscroft, Stefan Goetze, Thomas Hain
A feature of TCNs is that their receptive field (RF) depends on the specific model configuration, which determines how many input frames are observed to produce an individual output frame.
Ranked #1 on Speech Dereverberation on WHAMR_ext
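The dependence of the receptive field on the model configuration can be made concrete for the common case where the dilation doubles with each block inside a repeated stack (as in Conv-TasNet-style TCNs). The helper below is a sketch under that assumption; the function name and the doubling schedule are choices for this example, not taken from the paper.

```python
def tcn_receptive_field(kernel_size, n_blocks, n_repeats):
    """Receptive field (in frames) of a TCN stack whose dilation doubles
    each block within a repeat (1, 2, 4, ...), repeated n_repeats times.
    Each dilated conv with kernel K and dilation d adds (K - 1) * d frames."""
    dilations = [2 ** b for b in range(n_blocks)] * n_repeats
    return 1 + sum((kernel_size - 1) * d for d in dilations)
```

For example, a stack with kernel size 3, 8 blocks per repeat, and 3 repeats can see 1531 input frames per output frame, which shows how quickly the RF grows with depth under exponential dilation.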
no code implementations • 23 Mar 2022 • George Close, Thomas Hain, Stefan Goetze
Training of speech enhancement systems often does not incorporate knowledge of human perception and thus can lead to unnatural sounding results.