5 datasets found

Tags: Natural Language Processing

  • WebText

    WebText, a dataset widely used for natural language processing tasks.
  • Wikipedia Corpus

    A subset of the Wikipedia corpus consisting of 7,500 English Wikipedia articles, each belonging to one of the following categories: People, Cities,...
  • Wikitext-2

    The paper does not explicitly describe its dataset, but it mentions that the authors used Wikitext-2 for text generation tasks.
  • Text8

    A text corpus used with Word2Vec, a distributed word embedding generator that uses an artificial neural network to learn dense vector representations of words.
  • Training Transformers to Perform Tasks

    A dataset for training transformers to perform tasks such as language translation and text generation.