-
SincNet: A Novel CNN Architecture for Speaker Recognition from Raw Waveforms
Speaker recognition is a very active research area with notable applications in various fields such as biometric authentication, forensics, security, speech recognition, and... -
BUT retransmitted audio dataset
The dataset of retransmitted audio used for PLDA adaptation -
VOiCES 2019 Speaker Recognition Challenge
The dataset used for the VOiCES 2019 Speaker Recognition challenge -
Timbre Dataset Generation
The proposed model uses the timbral properties of voice, which are hardly used in other research endeavors. The model is tested against a real-world continuous stream of... -
NIST SRE16
Speaker recognition evaluation dataset -
2000 NIST Speaker Recognition Evaluation
The dataset is used for speaker diarization tasks. -
NIST SRE 2000 CALLHOME
The dataset is used for speaker diarization tasks. -
ASVspoof2019
The ASVspoof2019 LA subset consists of three parts: training, development, and evaluation. Each partition has a disjoint set of speakers. The average duration of the utterances... -
Free Spoken Digit Dataset
The dataset is a collection of 8 kHz audio recordings of spoken digits from 'zero' to 'nine'. -
VoxCeleb dataset
The VoxCeleb dataset is a large-scale speaker identification dataset, used to evaluate the performance of speaker recognition systems. -
TIMIT Corpus
The TIMIT corpus is a large database of speech recordings used for speaker recognition and speech synthesis tasks. -
Voxceleb2: Deep speaker recognition
Voxceleb2: Deep speaker recognition. -
A Crowdsourced Open-Source Kazakh Speech Corpus and Initial Speech Recognitio...
The Kazakh speech corpus (KSC) contains around 332 hours of transcribed audio comprising over 153,000 utterances spoken by participants from different regions and age groups, as... -
VCTK Corpus
The VCTK corpus is an English multi-speaker dataset, with 44 hours of audio spoken by 109 native English speakers.