NLPContributionGraph Trial Dataset

An Annotation Scheme for Machine Reading of Scholarly Contributions in Natural Language Processing Literature

This dataset is the result of a pilot annotation exercise to capture the scholarly contributions in natural language processing (NLP) articles, particularly articles that discuss machine learning (ML) approaches to information extraction tasks. The exercise was performed on 50 NLP-ML scholarly articles presenting contributions to five information extraction tasks: (1) machine translation, (2) named entity recognition, (3) question answering, (4) relation classification, and (5) text classification.

The outcome of this pilot annotation exercise was two-fold: 1) a preliminary annotation methodology, and 2) the dataset released in this repository.

The resulting annotation scheme is called NLPContributions.

Supporting Publications

D’Souza, J., & Auer, S. (2020). NLPContributions: An Annotation Scheme for Machine Reading of Scholarly Contributions in Natural Language Processing Literature. In C. Zhang, P. Mayr, W. Lu, & Y. Zhang (Eds.), Proceedings of the 1st Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents co-located with the ACM/IEEE Joint Conference on Digital Libraries in 2020, EEKE@JCDL 2020, Virtual Event, China, August 1st, 2020 (Vol. 2658, pp. 16–27).

D’Souza, J., & Auer, S. (2021). Sentence, Phrase, and Triple Annotations to Build a Knowledge Graph of Natural Language Processing Contributions—A Trial Dataset. Journal of Data and Information Science, 6(3), 6–34. https://doi.org/10.2478/jdis-2021-0023

Cite this as

Jennifer D’Souza, Soeren Auer (2020). Dataset: NLPContributionGraph Trial Dataset. https://doi.org/10.25835/0019761

DOI retrieved: July 3, 2020

Additional Info

Imported on: October 14, 2021
Last update: August 4, 2023
License: CC-BY-SA-3.0
Source: https://data.uni-hannover.de/dataset/nlpcontributions-pilot-dataset
Version: 2.0
Author: Jennifer D’Souza
More Authors: Soeren Auer
Author Email: Jennifer D’Souza
Maintainer: Jennifer D’Souza
Maintainer Email: Jennifer D’Souza
Source Creation: 03 July, 2020, 12:08 PM (UTC+0000)
Source Modified: 21 February, 2022, 13:09 (UTC+0000)