8 datasets found

Formats: JSON
Tags: audio dataset

  • GTZAN dataset

    The GTZAN dataset is a small but popular dataset for genre classification. It covers 10 musical genres, each with 100 audio snippets of 30 s.
  • Google Speech Command Dataset

    The Google Speech Command Dataset is a dataset for keyword spotting, which is a task in speech recognition. The dataset contains 12 classes, including 10 keywords and two extra...
  • GTZAN

    The GTZAN dataset is a comprehensive collection of 1000 audio tracks, each 30 seconds long, representing ten diverse music genres.
  • Speech Commands Dataset

    The datasets used for training the keyword spotting model are ESC: Dataset for Environmental Sound Classification and the Speech Commands Dataset.
  • VoxCeleb: A Large-Scale Speaker Identification Dataset

  • AudioCaps

    Audio-text retrieval aims at retrieving a target audio clip or caption from a pool of candidates given a query in another modality.
  • Clotho

    Automated audio captioning is a cross-modal translation task for describing the content of audio clips with natural language sentences.
  • Librispeech

    The Librispeech dataset is a large-scale speaker-dependent speech corpus containing 1080 hours of speech, 5600 utterances, and 1000 speakers.
You can also access this registry using the API (see API Docs).