-
Wikipedia dataset
The dataset used in the paper is the Wikipedia dataset, which contains over six million English Wikipedia articles with a full-text field associated with 50 training queries... -
Natural Questions
The Natural Questions dataset consists of questions extracted from web queries, with each question accompanied by a corresponding Wikipedia article containing the answer. -
Wizard of Wikipedia
Wizard of Wikipedia is a recent, large-scale dataset of multi-turn knowledge-grounded dialogues between a “apprentice” and a “wizard”, who has access to information from... -
Validation Dataset
The Validation Dataset is used for validation, it contains 1428 images from nine distinct rooms. -
Wikipedia Comparable Corpora
Multilingual dataset for topic modeling based on aligned Wikipedia articles extracted from Wikipedia Comparable Corpora