Machine Reading Comprehension - Groups

DREAM

The DREAM dataset is a dialogue-based multiple-choice QA dataset, introduced by Sun et al. (2019). It was collected from English-as-a-foreign-language examinations, designed by...
- Dataset
- JSON
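
As a rough illustration of what a dialogue-based multiple-choice item looks like and how such a task is usually scored, here is a minimal sketch. The `MultipleChoiceItem` class, its field names, and the sample dialogue are hypothetical and invented for illustration; they are not taken from the DREAM release and may not match its actual JSON layout.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MultipleChoiceItem:
    """Hypothetical in-memory form of one dialogue-based multiple-choice question."""
    dialogue: List[str]   # speaker turns, in order
    question: str
    choices: List[str]    # candidate answers
    answer: str           # the correct choice, stored as text


def accuracy(items: List[MultipleChoiceItem], predictions: List[str]) -> float:
    """Fraction of items whose predicted choice equals the gold answer."""
    if not items:
        return 0.0
    correct = sum(1 for item, pred in zip(items, predictions) if pred == item.answer)
    return correct / len(items)


# Made-up example item (not from the dataset).
sample = MultipleChoiceItem(
    dialogue=["W: Is the library open on Sunday?", "M: Only in the morning."],
    question="When is the library open on Sunday?",
    choices=["In the morning.", "In the afternoon.", "It is closed."],
    answer="In the morning.",
)
```

Multiple-choice benchmarks of this kind are typically reported with plain accuracy, which is why the sketch compares the predicted choice to the gold answer directly.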
C3

C3 is a multiple-choice reading comprehension dataset. Here we use the C3 dataset proposed in [32], which is a competition dataset. It is necessary to construct pseudo-evidence...
- Dataset
- JSON
CBT

The CBT dataset is an English machine reading comprehension dataset.
- Dataset
- JSON
CFT

The CFT dataset is a Chinese machine reading comprehension dataset.
- Dataset
- JSON
PD

The PD dataset is a Chinese machine reading comprehension dataset.
- Dataset
- JSON
CMRC-2017

The CMRC-2017 dataset is a Chinese machine reading comprehension dataset.
- Dataset
- JSON
Subword-augmented Embedding for Cloze Reading Comprehension

The proposed SAW Reader uses subword embeddings to enhance word representations and restricts the word frequency spectrum so that rare words are trained efficiently.
- Dataset
- JSON
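
The idea of combining a frequency-limited word table with subword embeddings can be sketched as follows. This is only an illustration under stated assumptions, not the SAW Reader implementation: character trigrams stand in for whatever subword segmentation the paper uses, averaging and concatenation are assumed ways of combining the vectors, and every name here (`char_ngrams`, `build_short_list`, `SubwordAugmentedEmbedding`, `top_k`) is hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List, Set


def char_ngrams(word: str, n: int = 3) -> List[str]:
    """Character n-grams with boundary markers (assumed stand-in for subwords)."""
    padded = f"<{word}>"
    if len(padded) <= n:
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def build_short_list(tokens: List[str], top_k: int) -> Set[str]:
    """Keep only the top_k most frequent words for word-level lookup."""
    return {word for word, _ in Counter(tokens).most_common(top_k)}


class SubwordAugmentedEmbedding:
    """Toy lookup: word vector (or shared <UNK>) concatenated with mean subword vector."""

    def __init__(self, tokens: List[str], top_k: int, dim: int = 4, seed: int = 0):
        self.dim = dim
        self.rng = random.Random(seed)
        self.short_list = build_short_list(tokens, top_k)
        self.word_vecs: Dict[str, List[float]] = {}
        self.subword_vecs: Dict[str, List[float]] = {}

    def _vec(self, table: Dict[str, List[float]], key: str) -> List[float]:
        # Lazily create a small random vector the first time a key is seen.
        if key not in table:
            table[key] = [self.rng.uniform(-0.1, 0.1) for _ in range(self.dim)]
        return table[key]

    def embed(self, word: str) -> List[float]:
        # Rare words share one <UNK> word vector; their identity comes from subwords.
        key = word if word in self.short_list else "<UNK>"
        word_vec = self._vec(self.word_vecs, key)
        sub = [self._vec(self.subword_vecs, g) for g in char_ngrams(word)]
        sub_mean = [sum(col) / len(sub) for col in zip(*sub)]
        return word_vec + sub_mean  # list concatenation, length 2 * dim


corpus = ["the", "the", "the", "cat", "cat", "zygote"]
emb = SubwordAugmentedEmbedding(corpus, top_k=2)
```

In this sketch, a rare word such as "zygote" falls outside the short list, so its word-level half is the shared `<UNK>` vector while its subword half still distinguishes it from other rare words.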
ReCAM

ReCAM is the dataset used in the paper for multiple-choice, cloze-style MRC tasks.
- Dataset
- JSON
DuReader

The DuReader dataset is a Chinese machine reading comprehension dataset that focuses on real-world web data.
- Dataset
- JSON
MS-MARCO

The MS-MARCO dataset is a large-scale question answering dataset that focuses on real-world web data.
- Dataset
- JSON
MS MARCO: A Human-Generated Machine Reading Comprehension Dataset

The dataset is used for training and evaluating the MS MARCO model, a question answering model.
- Dataset
- JSON

11 datasets found