Speech Recognition - Groups

Data set B

A dataset for continuous speech recognition experiments using EEG features.

Data set A and B

A dataset for isolated and continuous speech recognition experiments using EEG features.

LibriSpeech: An ASR Corpus Based on Public Domain Audio Books

LibriSpeech: an ASR corpus based on public domain audio books.

Tedlium3

Tedlium3: A large-scale English speech corpus for speaker adaptation.

GTZAN dataset

The GTZAN dataset is a small but popular dataset for genre classification. It contains 10 musical genres, each with 100 audio snippets of 30 seconds.

Free Spoken Digit Dataset

The dataset is a collection of 8 kHz audio recordings of spoken digits from 'zero' to 'nine'.

Lwazi speech corpus

Collecting and evaluating speech recognition corpora for nine Southern Bantu languages.

NCHLT speech corpus

The NCHLT speech corpus of the South African languages.

EasyASR

The dataset used in this paper is EasyASR, a distributed machine learning platform for end-to-end automatic speech recognition.

INT8 Winograd Acceleration for Conv1D Equipped ASR Models Deployed on Mobile ...

The dataset used in this paper is a Conv1D-equipped ASR model deployed on mobile devices.

Attention-based beamformers for multi-channel speech recognition

The proposed 2D Conv-Attention model is compared with a traditional neural beamformer and a multi-head attention-based model.

People’s Speech

The People’s Speech: A large-scale diverse English speech recognition dataset for commercial usage.

Libriheavy: A 50,000 Hours ASR Corpus with Punctuation Casing and Context

Libriheavy is a large-scale ASR corpus consisting of 50,000 hours of read English speech derived from LibriVox. To the best of our knowledge, Libriheavy is the largest...

CHiME-2

The CHiME-2 dataset is a speech separation and recognition challenge dataset. It contains 7138 utterances of 8 speakers, each with 10 seconds of speech.

MHINT

The MHINT corpus is a Mandarin Chinese speech corpus used for speech recognition and speech enhancement. It contains 480 utterances of 10 speakers, each with 10 seconds of speech.

DeepMine

DeepMine is a Persian speech corpus.

Google Speech Command

Google Speech Command (GSC) is a dataset for limited-vocabulary speech recognition.

ATIS dataset

The ATIS dataset is a benchmark dataset for spoken language understanding, consisting of audio recordings and corresponding manual transcripts about humans asking for flight...

UK-PODS-ALIGN

This work showcases a cost-effective method for generating training data for speech processing tasks. UK-PODS-ALIGN is a dataset that features modern conversational...

MCV-10

This work showcases a cost-effective method for generating training data for speech processing tasks. MCV-10 is a multilingual dataset that contains 50 hours of...

194 datasets found