-
TEDLIUM Corpus
The TEDLIUM corpus is a large corpus of transcribed TED talks used for speech recognition and text summarization. -
A Hybrid CNN-BiLSTM VOICE ACTIVITY DETECTOR
A hybrid CNN-BiLSTM voice activity detector (VAD) incorporating both convolutional neural network (CNN) and bidirectional long short-term memory... -
TIMIT dataset
The dataset used in this paper covers a phonetically and phonologically local allophonic distribution in English, where voiceless stops surface as aspirated... -
Multi-Scale Octave Convolutions for Robust Speech Recognition
Multi-scale octave convolutional layers for robust speech recognition -
AMI Meeting Corpus
The AMI Meeting Corpus was collected in three instrumented rooms with meeting conversations. Each room has two microphone arrays to collect 100 hours of far-field... -
Voice Bank Corpus
The Voice Bank Corpus is a large regional accent speech database containing over 10 hours of speech data from 20 speakers. -
Sanskrit ASR dataset
A dataset for Sanskrit ASR -
वाक् सञ्चयः (/Vāksañcayaḥ/)
A new Sanskrit speech corpus and a large-vocabulary ASR system for Sanskrit -
Google Speech Command Dataset
The Google Speech Command Dataset is a dataset for keyword spotting, which is a task in speech recognition. The dataset contains 12 classes, including 10 keywords and two extra... -
Switchboard Corpus
The Switchboard corpus is a dataset of two-sided telephone conversations between speakers from across the United States. -
Libri-Light
The dataset used in the paper is Libri-Light, a large collection of mostly unlabelled English speech derived from LibriVox audiobooks, the same source as LibriSpeech. The authors used this dataset to pre-train their proposed dual-mode ASR... -
Masked Acoustic Unit for Mispronunciation Detection and Correction
The proposed method uses the acoustic unit (AU) as the intermediary feature for both mispronunciation detection and correction. -
TIMIT Acoustic-Phonetic Continuous Speech Corpus
The TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM contains read speech from 630 male and female speakers of American English.