2 datasets found

Groups: Text Analysis

  • The Pile

    The Pile dataset contains 3.5 million samples of diverse text for language modeling.
  • Twitter Dataset

    The Twitter Dataset is a collection of tweets in three languages (English, Dutch, and German), annotated with Plutchik's emotions.