Speech Recognition - Groups

ESPNet

This entry covers speech recognition data used with the ESPNet framework, an end-to-end speech processing toolkit.

Dataset
JSON

Kaldi Speech Recognition Toolkit

Kaldi is a widely used open-source toolkit for speech recognition; this entry covers speech recognition data associated with it.

Dataset
JSON

WAV2LETTER++

The paper does not explicitly name its dataset, but it is implied to be a speech recognition dataset.

Dataset
JSON

Corpus of Spoken Dutch

The Corpus of Spoken Dutch (CGN) is a dataset of spoken Dutch recordings.

Dataset
JSON

Language Models of Spoken Dutch

The dataset consists of subtitles of television shows provided by the Flemish public-service broadcaster VRT. The dataset is used to train language models of spoken Dutch.

Dataset
JSON

AccentNet

The accented English speech recognition challenge 2020: Open datasets, tracks, baselines, results and methods.

Dataset
JSON

GigaSpeech

GigaSpeech: An evolving, multi-domain ASR corpus with 10,000 hours of transcribed audio.

Dataset
JSON

FastInject: Injecting Unpaired Text Data into CTC-Based ASR Training

This paper proposes a flat-start joint training method, named FastInject, to inject unpaired text data into CTC-based ASR training.

Dataset
JSON

TED-LIUM 3

TED-LIUM 3 (TL3) is a dataset of TED talks. Speaker adaptation data for TL3 was split randomly, with 2/5 assigned to the train set, 1/5 to the dev set, and...

Dataset
JSON
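
The random split described for TED-LIUM 3 above can be illustrated with a minimal sketch. It assumes the data is a flat list of items (the unit, speakers or utterances, is not stated in the entry), and the function name `split_items` and the seed are hypothetical. Because the description is truncated after the dev fraction, whatever is left over is returned as an unnamed remainder rather than assigned to a specific set.

```python
import random


def split_items(items, seed=0):
    """Randomly partition items into train (2/5), dev (1/5), and a remainder.

    Hypothetical helper: the fractions come from the entry text; the
    destination of the remainder is not stated there, so it is left unnamed.
    """
    shuffled = list(items)
    # A seeded generator keeps the shuffle reproducible across runs.
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    n_train = (2 * n) // 5  # 2/5 of the items, rounded down
    n_dev = n // 5          # 1/5 of the items, rounded down

    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    rest = shuffled[n_train + n_dev:]
    return train, dev, rest


# Illustrative item identifiers, not real TED-LIUM 3 file names.
items = [f"item{i:03d}" for i in range(10)]
train, dev, rest = split_items(items)
```

Integer arithmetic is used for the set sizes so the counts do not depend on floating-point rounding.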

Speaker Anonymization using X-Vector and Neural Waveform Models

Speaker anonymization using x-vector and neural waveform models.

Dataset
JSON

NIST RT-03 English CTS

The dataset is used for speaker diarization tasks.

Dataset
JSON

Hypothesis Stitcher for End-to-End Speaker-Attributed ASR on Long-Form Multi-...

An end-to-end (E2E) speaker-attributed automatic speech recognition (SA-ASR) model was proposed recently to jointly perform speaker counting, speech recognition and speaker...

Dataset
JSON

BD-4SK-ASR

The dataset used in this paper is BD-4SK-ASR, an experimental dataset used in the first attempt to develop an ASR system for Sorani Kurdish.

Dataset
JSON

VoiceHome-2

The dataset used in this paper is VoiceHome-2, an extended corpus for multichannel speech processing in real homes.

Dataset
JSON

ASRU 2019 Mandarin-English code-switching speech recognition challenge

The ASRU 2019 Mandarin-English code-switching speech recognition challenge dataset.

Dataset
JSON

Wall Street Journal

The Wall Street Journal dataset is used for syntactic linearization. It contains a large corpus of news articles with their corresponding syntactic trees.

Dataset
JSON

Video Corpus

A corpus of free and representative video content was gathered. This corpus includes videos having progressive scanning, 1280x720 resolution, and framerates between 24-30 frames...

Dataset
JSON

Correction Focused Language Model Training for Speech Recognition

Language models (LMs) have been commonly adopted to boost the performance of automatic speech recognition (ASR), particularly in domain adaptation tasks. Conventional way of LM...

Dataset
JSON

AudioMNIST dataset

The dataset used in the paper is the AudioMNIST dataset, which contains 30,000 audio recordings.

Dataset
JSON

Convolutional Neural Networks for Speech Recognition

The Speech Recognition dataset is used for speech recognition tasks.

Dataset
JSON

194 datasets found