-
COVID-19 Information Retrieval and Extraction
The dataset used for COVID-19 information retrieval and extraction. -
TREC 2019 and TREC 2020 Deep Learning Track datasets
TREC 2019 and TREC 2020 Deep Learning Track datasets -
MS MARCO and DL-Typo
Two datasets used in the paper: MS MARCO and DL-Typo. -
SERP dataset
The dataset used in the paper is a collection of search engine result pages (SERPs) with their corresponding relevance scores. -
Wikipedia Corpus
The dataset used in the paper is a subset of the Wikipedia corpus, consisting of 7500 English Wikipedia articles belonging to one of the following categories: People, Cities,... -
Wikipedia dataset
The dataset used in the paper is the Wikipedia dataset, which contains over six million English Wikipedia articles with a full-text field associated with 50 training queries... -
Baidu Search Dataset
The Baidu search dataset is a large-scale search dataset for unbiased learning to rank. -
ULTRE-2 Task
The ULTRE-2 task encourages participants to explore ULTR approaches to alleviate various types of biases in real user clicks during training, and achieve better ranking... -
Reuters21578
The problem of similarity search is to find the most similar items in a large collection to a query item of interest. Fast similarity search is at the core of many information... -
WordNet-Based Information Retrieval Using Common Hypernyms and Combined Features
Text search based on lexical matching of keywords is not satisfactory due to polysemous and synonymous words. Semantic search that exploits word meanings improves search... -
Tetun Test Collection
The Tetun test collection is a document-level audited dataset for relevance judgments. -
Labadain-30k+
The Labadain-30k+ dataset is a monolingual Tetun document-level audited dataset. -
Reuters-21578
Text classification has long been an active research field; its aim is to develop algorithms that find the categories of given documents. -
TREC-COVID
The TREC-COVID dataset is a collection of journal articles related to COVID-19 and other coronaviruses, with human annotators providing relevancy judgments at the end of each... -
MS MARCO, NQ, TREC DL, TREC-COVID
Four datasets are used to evaluate the retrieval effectiveness of different dimension reduction models, including MS MARCO (Passage Ranking), NQ, TREC DL, and TREC-COVID.