-
PatentEval Dataset
The PatentEval dataset is a comprehensive dataset for evaluating patent text generation. -
Big Patent Dataset
The Big Patent dataset is a large-scale dataset for abstractive and coherent summarization. -
Harvard USPTO Patent Dataset
The Harvard USPTO Dataset is a large-scale, well-structured, and multi-purpose corpus of patent applications. -
A Benchmark Dataset for Learning to Intervene in Online Hate Speech
A benchmark dataset for learning to intervene in online hate speech. -
Penn Treebank dataset
The dataset used in the paper is the Penn Treebank, a corpus of English text annotated with part-of-speech tags and syntactic parse trees. -
Ubuntu Dialogue Corpus
The Ubuntu Dialogue Corpus is the largest freely available multi-turn dialogue corpus, consisting of almost one million two-way conversations extracted from the Ubuntu... -
WikiTableQuestions
Semantic parsing maps a user-issued natural language (NL) utterance to a machine-executable meaning representation (MR), such as λ-calculus (Zettlemoyer and Collins, 2005), SQL... -
ToolWriter: Generating query-specific tools for tabular question answering
Tabular question answering (TQA) presents a challenging setting for neural systems by requiring joint reasoning of natural language with large amounts of semi-structured data. -
Penn Treebank (PTB) dataset
The Penn Treebank (PTB) dataset is used for the word ordering task, where it serves to evaluate the performance of different models. -
PAUSE: Positive and Annealed Unlabeled Sentence Embedding
PAUSE is a generic and end-to-end sentence embedding approach that exploits the labels and explores the unlabeled sentence pairs simultaneously. -
Leibniz University Hannover
Imported
STEM-NER-60k
A Large-scale Dataset of STEM Science as PROCESS, METHOD, MATERIAL, and DATA Named Entities. This repository hosts data as a follow-up study to the following publications... -
Leibniz University Hannover
Imported
SemEval-2021 Task 11 Shared Task Dataset
NLPContributionGraph: Structuring Scholarly NLP Contributions in the Open Research Knowledge Graph. Background: NLPContributionGraph was introduced as Task 11 at SemEval 2021 for... -
Leibniz University Hannover
Imported
NLPContributionGraph Trial Dataset
An Annotation Scheme for Machine Reading of Scholarly Contributions in Natural Language Processing Literature. This dataset is the result of a pilot annotation exercise to... -
Leibniz University Hannover
Imported
CS-NER
Computer Science Named Entity Recognition in the Open Research Knowledge Graph. 1) About: This work proposes a standardized CS-NER task by defining a set of seven... -