Speech Recognition - Groups

Fluent Speech Command dataset

The Fluent Speech Command dataset is a dataset for end-to-end spoken language understanding (SLU) tasks, consisting of single-channel audio clips sampled at 16 kHz.

Hub5e-swb Dataset

The Hub5e-swb dataset is the Switchboard portion of the NIST Hub5 2000 English evaluation set, consisting of conversational telephone speech.

AISHELL-1

The AISHELL-1 dataset is a Mandarin speech corpus, consisting of 178 hours of speech, with 11 domains and 400 speakers from different accent areas in China.

SEAME corpus

SEAME corpus is a Mandarin-English code-switching speech corpus.

BABEL-Pashto

The BABEL-Pashto dataset is the Pashto portion of the multilingual BABEL speech recognition corpus, containing Pashto speech recordings.

EAT: Enhanced ASR-TTS for Self-Supervised Speech Recognition

Self-supervised ASR-TTS models suffer in out-of-domain data conditions. Here we propose an enhanced ASR-TTS model that incorporates two main features: 1) The ASR→TTS direction...

asya

asya is a mobile application that listens to a person’s voice and provides private feedback on that person’s verbal communication.

End-to-End Neural Speaker Diarization with Permutation-Free Objectives

The End-to-End Neural Speaker Diarization dataset is a benchmark for speaker diarization.

The Third DIHARD Diarization Challenge

The DIHARD dataset is a benchmark for speaker diarization.

DNS-5 dataset

The dataset used in the paper is a benchmarking dataset from the 5th Deep Noise Suppression (DNS) Challenge for speech enhancement.

Perception of Phonological Assimilation

The dataset used in this study consists of 48 stimuli, each containing a word pair with a place assimilation, and a carrier sentence. The stimuli are designed to test the...

Transformer based Whisper Bangla ASR model

A transformer-based Whisper Bangla ASR model

Bengali Medical Corpus

A comprehensive 46-hour Bengali medical corpus encompassing disease names, symptoms, and symptom severity.

Highly-Reverberant Real Environment database (HRRE)

Highly-Reverberant Real Environment database (HRRE) contains 13.4 hours of data recorded in real reverberant environments and consists of 20 different testing conditions.

Commandersong: a systematic approach for practical adversarial voice recognition

Commandersong: a systematic approach for practical adversarial voice recognition.

Trojan-model: a practical trojan attack against automatic speech recognition ...

Trojan-model: a practical trojan attack against automatic speech recognition systems.

Keyword spotting in continuous speech using convolutional neural network

Keyword spotting in continuous speech using convolutional neural network.

Speech Command Dataset (SCD)

Speech Command Dataset (SCD) is a publicly available dataset of spoken English commands categorized into 35 distinct classes.

FlowMur: A Stealthy and Practical Audio Backdoor Attack with Limited Knowledge

Speech recognition systems driven by Deep Neural Networks (DNNs) have revolutionized human-computer interaction through voice interfaces, which significantly facilitate our...

TEDLIUM2

The TEDLIUM2 dataset (TED-LIUM release 2) is a corpus of English TED talk recordings with transcriptions, built for speech recognition tasks.
