French Street Name Signs Dataset

The French Street Name Signs (FSNS) dataset contains images of French street name signs extracted from Google Streetview, featuring low resolution text lines in natural scenes...

WFLW

WFLW contains 10,000 faces, each with 98 fully manually annotated landmarks. It is designed to be a challenging dataset with rich attribute annotations.

300-W

300-W is currently the most widely used dataset for facial landmark detection, created from four datasets including AFW, LFPW, HELEN, and IBUG, with each image annotated with 68...

Google Billion Word dataset

The Google Billion Word dataset is one of the largest language modeling datasets with almost one billion tokens and a vocabulary of over 800K words, based on an English corpus...

MojiTalk

The MojiTalk dataset consists of 596,959 post and response pairs from Twitter, where each response is labeled with one of 64 emojis indicating the emotion of the response.

CNN/Daily Mail corpus

The CNN/Daily Mail corpus contains pairs of online news articles and their summaries, consisting of approximately 287,000 training pairs, 13,368 validation pairs, and 11,490...

TL;DR Reddit corpus

The TL;DR Reddit corpus consists of approximately 3 million content-summary pairs mined from Reddit, designed for the TL;DR challenge focusing on text summarization.
