- Expert pairs for communication game
The dataset used in the paper is a set of 30 Expert pairs trained on a communication game, with inputs and outputs of pairs of agents trained to convergence on a reconstruction...
- User Adaptive Language Learning Chatbot Dataset
The dataset used in the paper is a collection of conversations between a user-adaptive language learning chatbot and 155 8th-grade Chinese students. The chatbot is designed to...
- SUBTLEX-CH
The dataset is used to optimize the learning order of Chinese characters.
- ARC database
The ARC database contains 358,534 monosyllabic non-words.
- L1-specific Swedish non-words
This dataset contains Swedish non-words generated using a 4-gram Swedish language model.
- Cambridge First Certificate in English (FCE) dataset
The Cambridge First Certificate in English (FCE) dataset is used as the source of ESL data. The corpus is a subset of the Cambridge Learner Corpus (CLC) and contains English...
- sentencesBooks dataset
A collection of sentences from literature books (sentencesBooks), containing 56,557 labels and 2,400 total words.
- sentencesInternet dataset
A collection of sentences collected from the Internet (sentencesInternet), containing 85,941 labels and 2,400 total words.
- Literature de jeunesse libre (LjL) dataset
The Literature de jeunesse libre (LjL) dataset contains 334,026 labels and 2,060 total words.
- Experiments with GPT-3-based difficulty estimation
The data used for the experiments comprise three datasets: Literature de jeunesse libre (LjL), a collection of sentences collected from the Internet (sentencesInternet),...
- Mandarin tone learning dataset for Japanese learners
The dataset used in the paper is a Mandarin tone learning dataset for Japanese learners.