-
Mixtral of Experts
The dataset used in the paper for the instruction-following task -
speechocean762
speechocean762: An open-source non-native English speech corpus for pronunciation assessment. -
Automatic Pronunciation Assessment
A hierarchical context-aware modeling approach for multi-aspect and multi-granular pronunciation assessment -
Experimental Results
The authors evaluate the performance of their proposed conformal prediction methods for multistep feedback covariate shift (MFCS) on synthetic black-box optimization and active... -
The Online Pivot: Lessons Learned from Teaching a Text and Data Mining Course...
A text and data mining course on Natural Language Processing, adapted for online teaching during the COVID-19 pandemic. -
WordNet Noun
The dataset used in this paper is the WordNet Noun dataset, which is a collection of nouns with their semantic relationships. -
Universal Conceptual Cognitive Annotation (UCCA)
The Universal Conceptual Cognitive Annotation (UCCA) dataset is annotated with UCCA, a graph-based semantic annotation scheme based on typological linguistic principles. -
Russian Noun Dataset
The dataset used for clustering contains the 2000 most frequent nouns in the Russian Web corpus. -
Spanish Noun Dataset
The dataset used for clustering contains the 2000 most frequent nouns in the Spanish Gigaword corpus. -
English Noun Dataset
The dataset used for clustering contains the 2000 most frequent nouns in the British National Corpus (BNC) and the English Gigaword corpus. -
Toward an Architecture for Never-ending Language Learning
Toward an Architecture for Never-ending Language Learning. -
Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks
Fine-grained analysis of sentence embeddings using auxiliary prediction tasks. -
LIMP Dataset
The dataset used in the paper is a set of 35 complex and ambiguous object goal navigation and mobile pick-and-place instructions. -
NAVER Open Podium and NAVER Encyclopedia
A large dataset of Korean text. -
Sanskrit ASR dataset
A dataset for Sanskrit automatic speech recognition (ASR) -
वाक् सञ्चयः (/Vāksañcayaḥ/)
A new Sanskrit speech corpus and a large-vocabulary ASR system for Sanskrit