ImageNet 64x64

The 64x64 ImageNet dataset is used for training a vector-quantized variational auto-encoder, encoding images into a tensor of latents.

Dataset
JSON

Pen

The Pen dataset consists of pen-based user interfaces for anomaly detection involving user input patterns.

Dataset
JSON

Optical

The Optical dataset is a collection of optical character recognition data used for detecting anomalies in text recognition.

Dataset
JSON

Satellite

The Satellite dataset includes satellite imagery data used for anomaly detection tasks, identifying unique patterns in the images.

Dataset
JSON

Letter

The Letter dataset contains handwritten letters for anomaly detection tasks, where outliers represent specific letter patterns.

Dataset
JSON

HatEval dataset

The HatEval dataset provides annotated tweets to evaluate hate speech detection, specifically concerning immigrants and women in a multilingual context.

Dataset
JSON

affNIST

The affNIST dataset is created by applying various affine transformations to the MNIST digits, making it suitable for testing algorithms designed to handle geometric distortions.

Dataset
JSON

Synthesized Dataset of Stylized and Real Face Pairs

A large-scale synthesized dataset of stylized face (SF) and ground-truth real face (RF) pairs is generated to train the Identity-preserving Face Recovery from Portraits (IFRP)...

Dataset
JSON

NP Bracketing Dataset

The NP bracketing dataset is used for classifying noun phrases based on bracketing.

Dataset
JSON

TREC Question Classification Dataset

The TREC dataset for question classification contains questions tagged with categories such as person and location.

Dataset
JSON

Stanford Sentiment Treebank

The Stanford Sentiment Treebank (SST-5) dataset is utilized for sentiment analysis, allowing models to evaluate sentiment in sentences through a tree-structured representation.

Dataset
JSON

English Wikipedia Dataset

The dataset consists of English Wikipedia articles used to train word vector models, containing 5.3M articles, 83M sentences, and 1,676M tokens.

Dataset
JSON

Annotated Humor Detection Dataset

The dataset consists of 3543 annotated tweets, featuring 1755 labeled as humorous and 1698 as non-humorous. It was used for evaluating the effectiveness of bilingual word...

Dataset
JSON

Flickr30k Captions

The Flickr30k Captions dataset consists of 30,000 images with five captions per image, facilitating research on image captioning.

Dataset
JSON

Webvision

The Webvision dataset is employed for training the model and evaluating its effectiveness in real-world scenarios.

Dataset
JSON

STS Benchmark Dataset

The STS Benchmark dataset was used for evaluating the semantic similarity of sentence pairs, comparing approaches with and without hand-crafted features.

Dataset
JSON

SemEval-2017 Semantic Textual Similarity Dataset

The SemEval-2017 dataset for Semantic Textual Similarity includes monolingual and cross-lingual sentence pairs for evaluating semantic similarity.

Dataset
JSON

SemEval-2016 Semantic Textual Similarity Dataset

The SemEval-2016 dataset for Semantic Textual Similarity was used to evaluate models on sentence pairs, with 90% of the data used for training and 10% for validation.

Dataset
JSON

German Traffic Sign Recognition Benchmark (GTSRB)

The GTSRB dataset consists of images of German traffic signs, utilized in the paper for evaluating the classification error and the impact of alignment on recognition.

Dataset
JSON

MNIST Cluttered dataset

The MNIST Cluttered dataset consists of images containing handwritten digits situated within a cluttered background, intended for assessing object recognition capabilities.

Dataset
JSON

24,167 datasets found