Evaluating SQuAD-based Question Answering for the Open Research Knowledge Graph Completion

doi:10.25835/blecbkwf

This dataset is part of the bachelor thesis "Evaluating SQuAD-based Question Answering for the Open Research Knowledge Graph Completion". It was created for fine-tuning BERT-based models pre-trained on the SQuAD dataset. The dataset was created using a semi-automatic approach on the ORKG data.

The dataset.csv file contains the entire data (all properties) in tabular form and is unsplit. The json files contain only the fields necessary for training and evaluation, plus additional fields (the start and end indices of the answers in the abstracts). The data in the json files is split into training data and evaluation data. We create four variants of the training and evaluation sets, one for each of the question labels ("no label", "how", "what", "which").
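As an illustration of the start and end index fields mentioned above, the following Python sketch shows one way such character offsets could be derived from an abstract and an answer string. It is not taken from the dataset's generation script: the function name `locate_answer`, the field names `text`, `answer_start` and `answer_end`, and the choice of an exclusive end index are all assumptions made for this example, and the actual json files may use different names and conventions.

```python
from typing import Optional


def locate_answer(abstract: str, answer: str) -> Optional[dict]:
    """Return character offsets of the first occurrence of `answer` in `abstract`.

    Hypothetical helper: field names and the exclusive end index are
    assumptions, not the dataset's documented schema.
    """
    start = abstract.find(answer)
    if start == -1:
        # The answer does not occur verbatim in the abstract.
        return None
    return {
        "text": answer,
        "answer_start": start,             # index of the first answer character
        "answer_end": start + len(answer),  # exclusive end index (assumption)
    }


if __name__ == "__main__":
    abstract = "We propose a BERT-based model for question answering."
    span = locate_answer(abstract, "BERT-based model")
    # Slicing the abstract with the offsets recovers the answer text.
    print(span, abstract[span["answer_start"]:span["answer_end"]])
```

With an exclusive end index, slicing the abstract from the start offset to the end offset returns the answer text exactly, which makes the offsets easy to verify against the abstracts.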

For detailed information on each of the fields in the dataset, refer to section 4.2 (Corpus) of the thesis document, available at https://www.repo.uni-hannover.de/handle/123456789/12958.

The script used to generate the dataset can be found in the public repositories https://github.com/as18cia/thesis_work and https://gitlab.com/TIBHannover/orkg/nlp/experiments/orkg-fine-tuning-squad-based-models.

Data and Resources

Cite this as

Moussab Hrou (2022). Dataset: Evaluating SQuAD-based Question Answering for the Open Research Knowledge Graph Completion. https://doi.org/10.25835/blecbkwf

DOI retrieved: November 1, 2022

Additional Info

Field	Value
Imported on	January 12, 2023
Last update	August 4, 2023
License	CC-BY-3.0
Source	https://data.uni-hannover.de/dataset/evaluating-squad-based-question-answering-for-the-open-research-knowledge-graph-completion
Author	Moussab Hrou
Maintainer	Moussab Hrou
Source Creation	01 November, 2022, 00:05 AM (UTC+0000)
Source Modified	05 December, 2022, 04:04 AM (UTC+0000)