Information Retrieval - Groups

Wikipedia dataset

The dataset used in the paper is the Wikipedia dataset, which contains over six million English Wikipedia articles with a full-text field associated with 50 training queries...

Dataset
JSON

Reuters21578

The problem of similarity search is to find the most similar items in a large collection to a query item of interest. Fast similarity search is at the core of many information...
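As a rough illustration of the problem statement above, here is a minimal brute-force sketch. It is not the method of the paper behind this dataset entry. The bag-of-words representation, the cosine measure, and the names `bag_of_words`, `cosine`, and `most_similar` are all assumptions made for this example. Fast similarity search methods aim to avoid exactly this kind of linear scan over the whole collection.

```python
import math
from collections import Counter


def bag_of_words(text):
    # Assumed representation: lowercase whitespace tokens with their counts.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two token-count vectors.
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, collection, k=1):
    # Score every item against the query (linear scan), then return the
    # indices of the k best matches; ties go to the lower index.
    q = bag_of_words(query)
    scored = [(cosine(q, bag_of_words(doc)), i) for i, doc in enumerate(collection)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]
```

Calling `most_similar("oil prices", docs, k=2)` on a small list of strings returns the positions of the two closest documents. The cost grows linearly with the collection size, which is why large collections call for faster approaches.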

Dataset
JSON

Reuters-21578

Text classification has long been an active research field. Its aim is to develop algorithms that find the categories of given documents.

Dataset
JSON

20NewsGroups

The dataset used in this paper is a collection of documents from various domains, including news, articles, and emails.

Dataset
JSON
