-
Fluent Speech Command dataset
The Fluent Speech Command dataset is designed for end-to-end spoken language understanding (SLU) tasks and consists of single-channel audio clips sampled at 16 kHz. -
Hub5e-swb Dataset
The Hub5e-swb dataset is the Switchboard portion of the NIST Hub5 2000 English evaluation set, consisting of conversational telephone speech. -
SEAME corpus
The SEAME corpus is a Mandarin-English code-switching speech corpus. -
BABEL-Pashto
The BABEL-Pashto dataset is the Pashto portion of the multilingual IARPA Babel corpus for speech recognition. -
EAT: Enhanced ASR-TTS for Self-Supervised Speech Recognition
Self-supervised ASR-TTS models suffer in out-of-domain data conditions. Here we propose an enhanced ASR-TTS model that incorporates two main features: 1) The ASR→TTS direction... -
End-to-End Neural Speaker Diarization with Permutation-Free Objectives
The dataset used in End-to-End Neural Speaker Diarization with Permutation-Free Objectives is a benchmark for speaker diarization. -
The Third DIHARD Diarization Challenge
The DIHARD dataset is a benchmark for speaker diarization. -
DNS-5 dataset
The DNS-5 dataset, from the fifth Deep Noise Suppression (DNS) Challenge, is a benchmark for speech enhancement and noise suppression. -
Perception of Phonological Assimilation
The dataset used in this study consists of 48 stimuli, each containing a word pair with a place assimilation, and a carrier sentence. The stimuli are designed to test the... -
Transformer based Whisper Bangla ASR model
A transformer-based Whisper Bangla ASR model. -
Bengali Medical Corpus
A comprehensive 46-hour Bengali medical corpus encompassing disease names, symptoms, and symptom severity. -
Highly-Reverberant Real Environment database (HRRE)
The Highly-Reverberant Real Environment database (HRRE) contains 13.4 hours of data recorded in real reverberant environments, covering 20 different testing conditions. -
Commandersong: a systematic approach for practical adversarial voice recognition
Commandersong: a systematic approach for practical adversarial voice recognition. -
Trojan-model: a practical trojan attack against automatic speech recognition ...
Trojan-model: a practical trojan attack against automatic speech recognition systems. -
Keyword spotting in continuous speech using convolutional neural network
Keyword spotting in continuous speech using convolutional neural network. -
Speech Command Dataset (SCD)
The Speech Command Dataset (SCD) is a publicly available collection of spoken English commands categorized into 35 distinct classes. -
FlowMur: A Stealthy and Practical Audio Backdoor Attack with Limited Knowledge
Speech recognition systems driven by Deep Neural Networks (DNNs) have revolutionized human-computer interaction through voice interfaces, which significantly facilitate our...