4 datasets found

Groups: Information Retrieval; Formats: JSON

  • Wikipedia dataset

    The dataset used in the paper is the Wikipedia dataset, which contains over six million English Wikipedia articles with a full-text field associated with 50 training queries...
  • Reuters21578

    The problem of similarity search is to find the most similar items in a large collection to a query item of interest. Fast similarity search is at the core of many information...
  • Reuters-21578

    Text classification has long been an active research field; its aim is to develop algorithms that assign categories to given documents.
  • 20NewsGroups

    The dataset used in this paper is a collection of documents from various domains, including news, articles, and emails.