STEM-ECR-v1.0

doi:10.25835/0017546

Grounding Scientific Entity References in STEM Scholarly Content to Authoritative Encyclopedic and Lexicographic Sources

The STEM-ECR v1.0 dataset provides a benchmark for evaluating scientific entity extraction, classification, and resolution in a domain-independent fashion. It comprises annotations of scientific entities in scientific abstracts drawn from 10 disciplines in Science, Technology, Engineering, and Medicine. The annotated entities are further grounded to Wikipedia and Wiktionary.

What does this repository contain?

The dataset is organized in the following folders:

Scientific Entity Annotations: Contains annotations for Process, Material, Method, and Data scientific entities in the STEM dataset.
Scientific Entity Resolution: Contains Entity Linking (EL) annotations to Wikipedia and Word Sense Disambiguation (WSD) annotations to Wiktionary for the scientific entities in the STEM dataset.
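
As a rough illustration of the information these two folders carry, the sketch below models a single annotated mention in Python. It is not the dataset's file format: the class names, field names (`text`, `start`, `end`, `wikipedia_title`, `wiktionary_sense`), and the example sentence are all hypothetical and chosen only for illustration. Only the four entity types and the two grounding targets come from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EntityType(Enum):
    # The four scientific entity types named in the dataset description.
    PROCESS = "Process"
    MATERIAL = "Material"
    METHOD = "Method"
    DATA = "Data"


@dataclass(frozen=True)
class EntityAnnotation:
    # Hypothetical record layout; the dataset's real files may be organized differently.
    text: str                                # surface form of the mention
    start: int                               # character offset where the mention begins
    end: int                                 # character offset just past the mention
    entity_type: EntityType
    wikipedia_title: Optional[str] = None    # Entity Linking (EL) target, if any
    wiktionary_sense: Optional[str] = None   # Word Sense Disambiguation (WSD) target, if any

    def is_resolved(self) -> bool:
        """Return True if the mention is grounded to at least one of the two sources."""
        return self.wikipedia_title is not None or self.wiktionary_sense is not None


# Made-up sentence and annotation, for illustration only (not taken from the corpus).
abstract = "We apply gradient descent to the training data."
mention = "gradient descent"
start = abstract.index(mention)
example = EntityAnnotation(
    text=mention,
    start=start,
    end=start + len(mention),
    entity_type=EntityType.METHOD,
    wikipedia_title="Gradient descent",
)
```

Keeping the two grounding fields optional reflects the assumption that a mention may be linked to one source, both, or neither; the actual files should be consulted before relying on this.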

Annotation Guidelines

The annotation guidelines that supported the creation of this corpus can be found here.

Supporting Publication

D'Souza, J., Hoppe, A., Brack, A., Jaradeh, M., Auer, S., & Ewerth, R. (2020). The STEM-ECR Dataset: Grounding Scientific Entity References in STEM Scholarly Content to Authoritative Encyclopedic and Lexicographic Sources. In Proceedings of The 12th Language Resources and Evaluation Conference (pp. 2192–2203). European Language Resources Association.

Cite this as

Jennifer D’Souza, Anett Hoppe, Arthur Brack, Mohamad Yaser Jaradeh, Soren Auer, Ralph Ewerth (2020). Dataset: STEM-ECR-v1.0. https://doi.org/10.25835/0017546

DOI retrieved: February 13, 2020

Additional Info

Field	Value
Imported on	October 14, 2021
Last update	August 4, 2023
License	CC-BY-SA-3.0
Source	https://data.uni-hannover.de/dataset/stem-ecr-v1-0
Version	1.0
Author	Jennifer D’Souza
More Authors	Anett Hoppe, Arthur Brack, Mohamad Yaser Jaradeh, Soren Auer, Ralph Ewerth
Author Email	Jennifer D’Souza
Maintainer	Jennifer D'Souza
Maintainer Email	Jennifer D'Souza
Source Creation	13 February, 2020, 12:57 PM (UTC+0000)
Source Modified	20 January, 2022, 11:00 AM (UTC+0000)