-
Google LLC dataset
A dataset of audio recordings used for training and evaluation of the Binaural Angular Separation Network. -
LibriVox and Freesound datasets
A combination of LibriVox and Freesound datasets used for training and evaluation. -
Piano-midi.de
Polyphonic music tasks characterized by multivariate time-series. The analysis is performed in terms of efficiency and prediction accuracy on 4 polyphonic music tasks. -
CL4AC: A CONTRASTIVE LOSS FOR AUDIO CAPTIONING
Automated audio captioning (AAC) is a cross-modal translation task that aims to describe the content of an audio clip in natural language. -
Mozart Dataset
The dataset used for training the model consists of 13 pieces of Mozart, 989 pieces for validation, and 11,821 pieces for testing. -
TSP speech database
The TSP speech database is a dataset of speech recordings. -
Isolet dataset
The dataset used in this paper is the Isolet dataset, which contains 4,000 13-channel audio recordings of 100 speakers. -
Freesound Dataset
The Freesound dataset consists of 18,873 audio files, each assigned one of the 41 unique audio events from Google's AudioSet Ontology. -
Semi Supervised Learning for Few-Shot Audio Classification by Episodic Triple...
Few-shot learning aims to generalize to unseen classes that appear during testing but are unavailable during training. The performance of prototypical networks in extreme few-shot... -
COVID-19 Identification ResNet (CIdeR)
The COVID-19 Identification ResNet (CIdeR) dataset consists of 517 crowdsourced coughing and breathing audio recordings from 355 participants, of which 62 participants had tested... -
VoiceBank DEMAND dataset
Speech enhancement dataset -
TIMIT dataset
The dataset used in this paper is a collection of phonetically and phonologically local allophonic distribution in English, where voiceless stops surface as aspirated...