Named Entity Recognition - Groups

ANERcorp

ANERcorp is a named entity recognition dataset.
- Dataset
- JSON
MasakhaNER 2.0

MasakhaNER 2.0 is an NER dataset in the news domain, with annotations for 20 African languages.
- Dataset
- JSON
PaDaS-Lab/legal-reference-annotations

A dataset of privacy policies annotated using GDPR-compliant named entities.
- Dataset
- JSON
Chinese NER using lattice LSTM

Chinese NER using lattice LSTM
- Dataset
- JSON
Financial news corpus for company name recognition

Includes a financial news corpus, a company names dictionary, the 35wSents and Albert65kError datasets, and development and test datasets.
- Dataset
- JSON
CoNLL03

The CoNLL03 dataset is a low-resource named entity recognition dataset. The dataset contains 4 entity types: person, location, organization, and miscellaneous entities. The...
- Dataset
- JSON
PANX and UDPOS datasets

The PANX and UDPOS datasets are used for Named Entity Recognition and Part-of-Speech Tagging tasks across the CJKV languages.
- Dataset
- JSON
Facebook Product Name Identification Dataset

A dataset of Facebook posts used for product name identification.
- Dataset
- JSON
SciFoodNER

A dataset of 88,526 ingredient phrases, created using Stratified Entity Frequency Sampling.
- Dataset
- JSON
Polyglot-ner

Polyglot-ner is a multilingual NER dataset.
- Dataset
- JSON
WikiGoldSK

WikiGoldSK is a manually annotated Slovak NER dataset.
- Dataset
- JSON
TAC2017 Adverse Drug Reaction Extraction Task Testing Dataset

The testing dataset used for the adverse drug reaction extraction task in TAC2017.
- Dataset
- JSON
TAC2017 Adverse Drug Reaction Extraction Task Training Dataset

The training dataset used for the adverse drug reaction extraction task in TAC2017.
- Dataset
- JSON
TAC2017 Adverse Drug Reaction Extraction Task

The dataset used for the adverse drug reaction extraction task in TAC2017.
- Dataset
- JSON
CONLL 2002

The dataset used for evaluation of the proposed model.
- Dataset
- JSON
GENIA

The GENIA dataset is a biomedical dataset with five entity types: DNA, RNA, protein, cell line, and cell type.
- Dataset
- JSON
ACE 2004

The dataset used for evaluation of the proposed model.
- Dataset
- JSON
I2B2 2009 Medical Information Extraction Challenge

Named Entity Recognition in Electronic Health Records using Transfer Learning Bootstrapped Neural Networks
- Dataset
- JSON
WikiNER

The dataset includes a larger set of English Wikipedia documents, which are tagged with named entities.
- Dataset
- JSON
ClubFloyd dataset

The ClubFloyd dataset is a collection of transcripts of humans playing text-based games, used to train action candidate generators.
- Dataset
- JSON

60 datasets found