Computer Science Named Entity Recognition in the Open Research Knowledge Graph
1) About
This work proposes a standardized computer science named entity recognition (CS NER) task by defining
seven contribution-centric scholarly entities: research problem, solution, resource, language,
tool, method, and dataset.
The main contributions are:
1) Merges annotations for contribution-centric named entities from related work as the following datasets:
2) Supplies a new annotated dataset of ACL Anthology titles, in the acl repository,
in which each title is annotated with all seven entities.
2) Dataset statistics for the full dataset
Titles
train.data
| NER | Count |
| --- | --- |
| solution | 65,213 |
| research problem | 43,033 |
| resource | 19,759 |
| method | 19,645 |
| tool | 4,856 |
| dataset | 4,062 |
| language | 1,704 |
dev.data
| NER | Count |
| --- | --- |
| solution | 3,685 |
| research problem | 2,717 |
| resource | 1,224 |
| method | 1,172 |
| tool | 264 |
| dataset | 191 |
| language | 79 |
test.data
| NER | Count |
| --- | --- |
| solution | 29,287 |
| research problem | 11,093 |
| resource | 8,511 |
| method | 7,009 |
| tool | 2,272 |
| dataset | 947 |
| language | 690 |
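As a quick sanity check on the title tables above, the following Python sketch totals the entity counts per split and prints each label's share. The numbers are copied from the tables; the names `TITLE_COUNTS` and `label_shares` are illustrative only and are not part of the repository.

```python
# Entity counts copied from the title tables (train.data, dev.data, test.data).
TITLE_COUNTS = {
    "train": {
        "solution": 65213, "research problem": 43033, "resource": 19759,
        "method": 19645, "tool": 4856, "dataset": 4062, "language": 1704,
    },
    "dev": {
        "solution": 3685, "research problem": 2717, "resource": 1224,
        "method": 1172, "tool": 264, "dataset": 191, "language": 79,
    },
    "test": {
        "solution": 29287, "research problem": 11093, "resource": 8511,
        "method": 7009, "tool": 2272, "dataset": 947, "language": 690,
    },
}


def label_shares(counts):
    """Return each label's fraction of all entities in one split."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


for split, counts in TITLE_COUNTS.items():
    print(f"{split}: {sum(counts.values())} entities")
    # List labels from most to least frequent.
    for label, share in sorted(label_shares(counts).items(), key=lambda kv: -kv[1]):
        print(f"  {label:<17}{share:6.1%}")
```

The output makes the class imbalance easy to see: solution and research problem dominate every split, while language and dataset are rare.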
Abstracts
train-abs.data
| NER | Count |
| --- | --- |
| research problem | 15,498 |
| method | 12,932 |
dev-abs.data
| NER | Count |
| --- | --- |
| research problem | 1,450 |
| method | 839 |
test-abs.data
| NER | Count |
| --- | --- |
| research problem | 4,123 |
| method | 3,170 |
The remaining repositories have specialized README files with the respective dataset statistics.
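Per-entity counts like those above could be recomputed from an annotated file with a short script. The sketch below is a minimal example that assumes a CoNLL-style layout (one token per line, the tag in the last whitespace-separated column, blank lines between sentences) and BIO tags such as `B-method`. This README does not specify the file format or the tag names, so treat both, along with the function name `count_entities` and the sample data, as assumptions to adapt to the actual `.data` files.

```python
from collections import Counter
from typing import Iterable


def count_entities(lines: Iterable[str]) -> Counter:
    """Count entity mentions per label in BIO-tagged, CoNLL-style lines.

    Assumed format (not confirmed by the README): "<token> ... <tag>",
    where <tag> is "O", "B-<label>", or "I-<label>".
    """
    counts = Counter()
    open_label = None  # label of the entity currently being read, if any
    for line in lines:
        line = line.strip()
        if not line:  # a blank line ends the sentence and any open entity
            open_label = None
            continue
        tag = line.split()[-1]
        if tag == "O":
            open_label = None
            continue
        prefix, _, label = tag.partition("-")
        # A mention starts at B-, or at an I- that does not continue the open entity.
        if prefix == "B" or label != open_label:
            counts[label] += 1
        open_label = label
    return counts


# Hypothetical sample in the assumed format; replace with open("train.data").
SAMPLE = """\
Named B-research_problem
Entity I-research_problem
Recognition I-research_problem
with O
BERT B-method

A O
corpus B-resource
for O
German B-language
""".splitlines()

print(count_entities(SAMPLE))
```

Counting mentions rather than tagged tokens matters here, since a multi-token entity such as a research problem should contribute one to its label's total.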
3) Citation
Accepted for publication in the ICADL 2022 proceedings.
Citation information is forthcoming.
Preprint
@article{d2022computer,
title={Computer Science Named Entity Recognition in the Open Research Knowledge Graph},
author={D'Souza, Jennifer and Auer, S{\"o}ren},
journal={arXiv preprint arXiv:2203.14579},
year={2022}
}
4) Additional resources
CS NER Software trained on the dataset in this repository
Codebase: https://gitlab.com/TIBHannover/orkg/nlp/orkg-nlp-experiments/-/tree/master/orkg_cs_ner
Service URL - REST API: https://orkg.org/nlp/api/docs#/annotation/annotates_paper_annotation_csner_post
Service URL - PyPI: https://orkg-nlp-pypi.readthedocs.io/en/latest/services/services.html#cs-ner-computer-science-named-entity-recognition