Dataset - LDM

Universal Dependencies (UD) treebanks

The paper does not explicitly name its dataset, but states that the authors used the Universal Dependencies (UD) treebanks.
- Dataset
- JSON
MACHAMP

MACHAMP is a toolkit for multi-task learning in NLP, supporting a wide range of NLP tasks.
- Dataset
- JSON
Data Management Operations and Recipes

Dataset management operations and recipes for NLP data production.
- Dataset
- JSON
A Workflow Manager for Complex NLP and Content Curation Pipelines

A workflow manager for the flexible creation and customisation of NLP processing pipelines.
- Dataset
- JSON
MatSci-NLP

The MatSci-NLP dataset is a collection of materials science text for NLP tasks.
- Dataset
- JSON
Towards Dark Jargon Interpretation in Underground Forums

Dark jargons are benign-looking words that have hidden, sinister meanings and are used by participants of underground forums for illicit behavior.
- Dataset
- JSON
ACL Anthology

The ACL Anthology dataset contains papers on natural language processing, including citation patterns, authorship, and language use over time.
- Dataset
- JSON
Cross-lingual semantic representation for NLP with UCCA

The UCCA dataset is used to test the annotation scheme in cross-lingual semantic representation for NLP.
- Dataset
- JSON
Multilingual Misinformation & Its Evolution

The dataset used in this study is a combination of data from Google Fact-Check explorer and data directly crawled from the websites of verified signatories of the International...
- Dataset
- JSON
TEL-NLP

The TEL-NLP dataset is a collection of Telugu text data for four NLP tasks: sentiment analysis, emotion identification, hate speech detection, and sarcasm detection.
- Dataset
- JSON
GLUE benchmark

The paper does not explicitly describe its dataset, but states that the authors used three downstream tasks from the GLUE benchmark: Stanford Sentiment Treebank...
- Dataset
- JSON
Dynahate

Dynahate: A dataset for hate speech detection.
- Dataset
- JSON
Social Chemistry 101

The Social Chemistry 101 dataset is a collection of social norms and rules of thumb (ROTs) for evaluating people's behavior in everyday social situations.
- Dataset
- JSON
NLPositionality

NLPositionality is a framework for characterizing design biases and quantifying the positionality of NLP datasets and models.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).
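
As a rough illustration of programmatic access, the sketch below assumes the registry exposes a CKAN-style action API with a `package_search` endpoint. That is an assumption, not something this page confirms, so check the API Docs first. The base URL `https://registry.example.org` is a placeholder, and the helper names `build_search_url`, `summarize`, and `search` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder base URL (assumption); replace with the registry's real address.
BASE_URL = "https://registry.example.org"


def build_search_url(query, rows=20, base_url=BASE_URL):
    """Build a CKAN-style package_search URL (assumed endpoint layout)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def summarize(response):
    """Return (count, titles) from a CKAN-style search response dict."""
    if not response.get("success"):
        raise ValueError("search request was not successful")
    result = response["result"]
    titles = [pkg.get("title", "") for pkg in result["results"]]
    return result["count"], titles


def search(query, rows=20, base_url=BASE_URL):
    """Fetch and summarize search results (performs a network request)."""
    with urlopen(build_search_url(query, rows, base_url)) as resp:
        return summarize(json.load(resp))
```

If the assumption holds, `search("NLP")` would return the number of matching datasets together with their titles, mirroring the listing on this page. `summarize` can also be applied to a response saved to disk, which avoids repeated requests.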

14 datasets found