Speech Recognition - Groups

Unsupervised word segmentation and lexicon discovery using acoustic word embe...

A dataset for the Zero Resource Speech Challenge 2015.
- Dataset
- JSON
Fixed-dimensional acoustic embeddings of variable-length segments in low-reso...

A dataset for the Zero Resource Speech Challenge 2015.
- Dataset
- JSON
The Zero Resource Speech Challenge 2015

A dataset for the Zero Resource Speech Challenge 2015.
- Dataset
- JSON
A segmental Bayesian framework for fully-unsupervised large-vocabulary speech...

A segmental Bayesian model for full-coverage segmentation and clustering of conversational speech audio.
- Dataset
- JSON
Amazon Alexa Dataset

A 23,000-hour corpus of untranscribed, de-identified, far-field English voice-command and voice-query speech.
- Dataset
- JSON
DeepSpeech

The DeepSpeech dataset used for evaluation of the proposed watermarking scheme.
- Dataset
- JSON
Speech Pattern Based Black-Box Model Watermarking for Automatic Speech Recogn...

The proposed black-box model watermarking framework for protecting the IP of ASR models.
- Dataset
- JSON
Query-by-example on-device keyword spotting

Query-by-example on-device keyword spotting.
- Dataset
- JSON
DailyTalk

DailyTalk: Spoken dialogue dataset for conversational text-to-speech.
- Dataset
- JSON
KWS-DailyTalk

KWS-DailyTalk is a five-shot KWS dataset aimed at detecting 15 different keywords, namely “afternoon”, “airport”, “cash”, “credit card”, “deposit”, “dollar”, “evening”,...
- Dataset
- JSON
Whisper

Whisper is a general-purpose speech recognition model.
- Dataset
- JSON
Open Subtitles dataset

The Open Subtitles dataset consists of transcriptions of spoken dialog in movies and television shows.
- Dataset
- JSON
Loss Prediction: End-to-End Active Learning for Speech Recognition

End-to-end speech recognition systems usually require huge amounts of labeling resources, while annotating speech data is complicated and expensive. Active learning is the...
- Dataset
- JSON
Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech...

This paper presents a well-known music identification method and implements it as a neural net.
- Dataset
- JSON
LRW

The LRW dataset is an English-language lip-reading dataset containing 500 different words, each spoken by over 1,000 persons.
- Dataset
- JSON
WIT3 Parallel Corpus

The WIT3 parallel corpus is a large-scale corpus of transcribed and translated talks.
- Dataset
- JSON
VoxForge dataset

The VoxForge dataset is a collection of audio recordings of human speech.
- Dataset
- JSON
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction o...

Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
- Dataset
- JSON
wav2vec: Unsupervised Pre-Training for Speech Recognition

Unsupervised Pre-Training for Speech Recognition
- Dataset
- JSON
Isolet dataset

The dataset used in this paper is the Isolet dataset, which contains 4,000 13-channel audio recordings of 100 speakers.
- Dataset
- JSON

160 datasets found