-
NP Bracketing Dataset
The NP bracketing dataset is used for classifying noun phrases based on their bracketing. -
TREC Question Classification Dataset
The TREC dataset for question classification contains questions tagged with categories such as person and location. -
Stanford Sentiment Treebank
The Stanford Sentiment Treebank (SST-5) dataset is used for sentiment analysis; its tree-structured representation lets models evaluate sentiment in sentences. -
English Wikipedia Dataset
The dataset consists of English Wikipedia articles used to train word vector models, containing 5.3M articles, 83M sentences, and 1,676M tokens. -
Annotated Humor Detection Dataset
The dataset consists of 3543 annotated tweets, featuring 1755 labeled as humorous and 1698 as non-humorous. It was used for evaluating the effectiveness of bilingual word... -
Flickr30k Captions
The Flickr30k Captions dataset consists of 30,000 images with five captions per image, facilitating research on image captioning. -
STS Benchmark Dataset
The STS Benchmark dataset was used for evaluating the semantic similarity of sentence pairs, comparing approaches with and without hand-crafted features. -
SemEval-2017 Semantic Textual Similarity Dataset
The SemEval-2017 dataset for Semantic Textual Similarity includes monolingual and cross-lingual sentence pairs for evaluating semantic similarity. -
SemEval-2016 Semantic Textual Similarity Dataset
The SemEval-2016 dataset for Semantic Textual Similarity was used to evaluate sentence pairs, with models trained on 90% of the data and validated on the remaining 10%. -
German Traffic Sign Recognition Benchmark (GTSRB)
The GTSRB dataset consists of images of German traffic signs, used in the paper to evaluate classification error and the impact of alignment on recognition. -
MNIST Cluttered dataset
The MNIST Cluttered dataset consists of images of handwritten digits placed on a cluttered background, intended for assessing object recognition capabilities. -
Visual Commonsense Reasoning (VCR)
VCR consists of 290k questions derived from 110k movie scenes, focusing on visual commonsense reasoning. -
US-CT Dataset
A synthetic dataset developed for ultrasound and CT image registration experiments, leveraging CT images to simulate ultrasound data for matching and localization. -
Human Face Database
A human face dataset used for evaluating image alignment techniques, containing altered and deformed images of human faces for testing alignment accuracy. -
MNIST Handwritten Digits Dataset
The MNIST handwritten digits dataset is a widely used benchmark dataset that consists of 60,000 training images and 10,000 testing images of handwritten digits, allowing... -
PF-PASCAL Benchmark
The PF-PASCAL benchmark comprises 1,351 image pairs over 20 object categories, with keypoint annotations for evaluating semantic correspondence. -
PF-WILLOW Benchmark
The PF-WILLOW benchmark contains 10 object sub-classes, each with 10 keypoint annotations for performance evaluation in semantic correspondence tasks. -
TSS Benchmark
The TSS benchmark consists of 400 image pairs divided into three groups for evaluating semantic correspondence methods. -
English Wikipedia
The English Wikipedia is widely used as a text corpus for NLP tasks.