Audio Classification - Groups

Bird10

Bird10 is a bird species dataset used for training and testing deep neural networks.

PANNS

PANNs (Pretrained Audio Neural Networks) are audio classification models pretrained on the large-scale AudioSet dataset.

Semi Supervised Learning for Few-Shot Audio Classification by Episodic Triple...

Few-shot learning aims to generalize to unseen classes that appear during testing but are unavailable during training. The performance of prototypical networks in extreme few-shot...

COVID-19 Cough Sub-Challenge

The dataset is used for automatic diagnosis of COVID-19 from crowdsourced respiratory sound data.

Primate Vocalisations Corpus

The dataset is used for automated classification of primate species.

ComParE21

The dataset is used for primate classification and COVID-19 detection tasks.

Audio Set

The Audio Set dataset covers over 2 million audio soundtracks drawn from general YouTube videos.

Audio Set: An ontology and human-labeled dataset for audio events

The authors used the Audio Set dataset to test their models.

Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers

The Audio Spectrogram Transformer (AST) model is used for audio classification tasks.

TAU Urban Acoustic Scenes 2019

The dataset is used for the acoustic scene classification task.

DCASE 2019

The dataset is used for acoustic scene classification, sound event detection, and image classification tasks.

VoxCeleb dataset

The VoxCeleb dataset is a large-scale speaker identification dataset, used to evaluate the performance of speaker recognition systems.

VGGSound

The VGGSound dataset is a large-scale audio-visual dataset containing over 200,000 10-second video clips with corresponding audio.

SemanticAC: Semantics-Assisted Framework for Audio Classification

SemanticAC is a semantics-assisted framework for audio classification that better leverages semantic information.

Speech Commands Dataset

The keyword spotting model is trained on two datasets: ESC (Dataset for Environmental Sound Classification) and the Speech Commands Dataset.

Speech Commands

The Speech Commands dataset consists of 105,809 one-second audio recordings of 35 spoken words, sampled at 16 kHz. The raw Speech Commands dataset presents audio recordings as a...

MIMII

A common assumption of novelty detection is that the distributions of both “normal” and “novel” data are static. This, however, is often not the case, for example, scenarios where...

ESC-50

The dataset used for training the CNN in cough detection is composed of various modified audio clips gathered from open-source online sources. Each of these audio files...

18 datasets found