2 datasets found

  • SEAME

    SEAME is a Mandarin-English code-switched speech corpus used for the code-switched speech recognition task.
  • CommonCrawl

    CommonCrawl is a non-profit organization that provides a large corpus of web pages for research and development purposes.