Speech Processing - Groups

SuperB: Speech Processing Universal Performance Benchmark

Universal performance benchmark for speech processing

Packet Loss Concealment

Packet loss is a common problem in Voice over IP transmission. It is a long-standing problem, and a variety of classical approaches have been developed...

Wikitext-103 and MusDB datasets

The paper does not explicitly name the dataset, but it states that the authors trained a 16-layer transformer-based (Vaswani et al., 2017) language model on...

Analysing Discrete Self Supervised Speech Representation for Spoken Language ...

This work analyzes discrete self-supervised speech representations (units) in depth through the lens of Generative Spoken Language Modeling (GSLM).

Extraction of Pitch and Formant Frequencies using Discrete Wavelet Transform

The proposed methods were applied to different signals, and the results were compared with those obtained from the cepstrum method.

SpeechBrain 1.0

SpeechBrain is an open-source Conversational AI toolkit based on PyTorch, focused particularly on speech processing tasks such as speech recognition, speech enhancement, speaker...

Common Voice Spoken Sentence Similarity

The Common Voice Spoken Sentence Similarity dataset was created based on the test set of the English subset of Common Voice. To get the similarity of every pair of sentences in...

speechocean762

speechocean762: An open-source non-native English speech corpus for pronunciation assessment.

Automatic Pronunciation Assessment

A hierarchical context-aware modeling approach for multi-aspect and multi-granular pronunciation assessment

VCTK Corpus

The VCTK corpus is an English multi-speaker dataset, with 44 hours of audio spoken by 109 native English speakers.

WHAM!

The WHAM! dataset is used for testing the proposed Bayesian factorised speaker-environment adaptive training and test time adaptation approach for Conformer models.

A Deep Generative Model of Speech Complex Spectrograms

This paper proposes an approach to the joint modeling of the short-time Fourier transform magnitude and phase spectrograms with a deep generative model.

LibriTTS

A popular text-based VC approach is to use an automatic speech recognition (ASR) model to extract phonetic posteriorgram (PPG) as content representation.

13 datasets found