University of Maryland Reddit Suicidality Dataset

The University of Maryland Reddit Suicidality Dataset contains Reddit posts from the r/SuicideWatch subreddit, used to assess suicidality risk based on user postings.

Dataset
JSON

SVHN

The SVHN (Street View House Numbers) dataset consists of over 600,000 digit images that are cropped from street view images, used for benchmarking algorithms dealing with noisy...

Dataset
JSON

CSMSC Dataset

The CSMSC dataset is a corpus for Mandarin Chinese speech synthesis research.

Dataset
JSON

JVS Corpus

The JVS corpus is a free Japanese multi-speaker voice corpus used for various speech synthesis tasks.

Dataset
JSON

Jacquard Dataset

The Jacquard dataset is a large-scale dataset for robotic grasp detection, featuring dense grasp rectangle annotations.

Dataset
JSON

Cornell Grasping Dataset

The Cornell Grasping Dataset (CGD) contains manually labeled grasp annotations for a limited number of examples, focusing on detecting robotic grasps.

Dataset
JSON

WMT English-German Translation

The WMT English-German translation task is used to evaluate supervised conditional language generation, measuring how well a model translates from English to German.

Dataset
JSON

MTG-Jamendo Dataset

The MTG-Jamendo dataset is used for automatically recognizing the emotions and themes in music recordings based on the raw audio, focusing on mood and theme tagging.

Dataset
JSON

Cornell Movie Dialogues

The Cornell Movie Dialogues dataset features two-character dialogues from movie scripts, capturing a large variety of human interaction in many different fictional circumstances.

Dataset
JSON

MalwareTextDB

The MalwareTextDB corpus consists of APT reports describing malware-related information, intended for text classification and token label prediction tasks.

Dataset
JSON

Holl-E

The Holl-E dataset consists of dialogues with a single document provided per conversation, including spans in documents that indicate parts used for generating responses.

Dataset
JSON

CelebA-HQ 256x256

The 256x256 CelebA-HQ dataset is utilized to train an Image Transformer for autoregressive image generation.

Dataset
JSON

ImageNet 64x64

The 64x64 ImageNet dataset is used for training a vector-quantized variational auto-encoder, encoding images into a tensor of latents.

Dataset
JSON

Pen

The Pen dataset consists of pen-based user input data, used for anomaly detection on input patterns.

Dataset
JSON

Optical

The Optical dataset is a collection of optical character recognition data used for detecting anomalies in text recognition.

Dataset
JSON

Satellite

The Satellite dataset includes satellite imagery data used for anomaly detection tasks, identifying unique patterns in the images.

Dataset
JSON

Letter

The Letter dataset contains handwritten letters for anomaly detection tasks, where outliers represent specific letter patterns.

Dataset
JSON

HatEval Dataset

The HatEval dataset provides annotated tweets to evaluate hate speech detection, specifically concerning immigrants and women in a multilingual context.

Dataset
JSON

affNIST

The affNIST dataset is created by applying various affine transformations to the MNIST digits, making it suitable for testing algorithms designed to handle geometric distortions.

Dataset
JSON

Synthesized Dataset of Stylized and Real Face Pairs

A large-scale synthesized dataset of stylized face (SF) and ground-truth real face (RF) pairs is generated to train the Identity-preserving Face Recovery from Portraits (IFRP)...

Dataset
JSON

20,499 datasets found