-
AI-hub Dialogue Dataset
AI-hub dialogue dataset for Korean dialogue processing -
Incomplete Syntax Influence Korean Language Model
Syntactically Incomplete Korean (SIKO) dataset for Korean language models -
IT Job Detection Dataset
Dataset for job detection in Twitter -
Job Detection in Twitter
Job detection in Twitter using the Skip-gram model and word2vec -
FST Morphological Analyser and Generator for Mapudüngun
FST Morphological Analyser and Generator for Mapudüngun -
Linguistic Data Set
The dataset used in this paper is a linguistic data set consisting of co-occurrences of 54 nouns and 58 adjectives in Charles Dickens' novel David Copperfield. -
NGEP: A Graph-based Event Planning Framework for Story Generation
NGEP: A Graph-based Event Planning Framework for Story Generation -
ECC Analyzer
The ECC Analyzer dataset is a collection of earnings conference calls (ECCs) with their corresponding transcripts and audio recordings. -
EC AI Platform
The paper does not explicitly describe the dataset, but it notes that the authors evaluated GPT-4 against three applications built with the EC AI platform for... -
Experiments with multilingual and language-specific pre-trained masked langua...
The datasets used in the experiments are annotated according to the Unimorph schema guidelines. -
SIGMORPHON 2019 datasets
The datasets developed for the SIGMORPHON 2019 lemmatization task are annotated according to the Unimorph schema guidelines. -
SNLI and MultiNLI datasets
The paper uses the SNLI and MultiNLI datasets, both of which target natural language inference tasks. -
Sanskrit Text Annotation
The Sanskrit text is annotated for various NLP tasks, including sentence boundary detection, canonical word ordering, free-form text annotation of tokens, token classification,... -
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algori...
Six-bit quantization can effectively reduce the size of large language models and preserve the model quality consistently across varied applications. -
Furiously Can Colourless Green Ideas Sleep?
The dataset is used in the paper to study the influence of context on sentence acceptability. -
Unfun.me dataset
A dataset of satirical and similar-but-serious-looking headlines collected via Unfun.me, an online game.