-
LARGE-SCALE STOCHASTIC OPTIMIZATION OF NDCG SURROGATES FOR DEEP LEARNING WITH...
The datasets used in the paper are MSLR-WEB30K and Yahoo! LTR, which are the largest public LTR datasets from commercial search engines. -
WikipassageQA, InsuranceQA v2, and MS-MARCO
The collection comprises three passage-ranking datasets: WikipassageQA, InsuranceQA v2, and MS-MARCO. -
PASSAGE RANKING WITH WEAK SUPERVISION
In this paper, we propose a weak supervision framework for neural ranking tasks based on the data programming paradigm (Ratner et al., 2016), which enables us to leverage... -
CLEF 2017 e-Health Lab Task 2
The dataset used for the experiments originated from the CLEF 2017 e-Health Lab Task 2 “Technology Assisted Reviews in Empirical Medicine”. -
Deeper text understanding for IR with contextual neural language modeling
This paper proposes a method for learning-to-rank with contextual neural language modeling. -
Learning to rank: from pairwise approach to listwise approach
This paper proposes a method for learning to rank, which is a key task in information retrieval. -
arXMLiv 2018
The arXMLiv 2018 dataset is an HTML collection of the arXiv.org preprint archive, used as a training corpus for word embedding techniques. -
Modelling Dynamic Interactions Between Relevance Dimensions
The dataset used in the paper is a user study dataset, where participants are shown query-document pairs and asked questions about different relevance dimensions. -
COVID-19 Vaccination Search Insights
The COVID-19 Vaccination Search Insights dataset is a collection of anonymized search queries and their corresponding labels, which indicate whether the query is related to COVID-19... -
TREC Deep Learning 2021 Collection
The TREC Deep Learning 2021 collection is a test collection for information retrieval evaluation that adopts a shallow pooling approach. -
TREC-8 Ad Hoc Collection
The TREC-8 ad hoc collection is a test collection for information retrieval evaluation, known for its high-quality pool. -
TREC Dynamic Domain 2015 ad-hoc retrieval task
The dataset used in the paper comes from the TREC Dynamic Domain 2015 ad-hoc retrieval task, which includes search result diversification. The dataset consists of 23 official runs and... -
TREC Web Track 2014 ad-hoc retrieval task
The dataset used in the paper comes from the TREC Web Track 2014 ad-hoc retrieval task, which includes search result diversification. The dataset consists of 50 test topics and 10,000... -
Web2Text: Deep Structured Boilerplate Removal
Web pages are a valuable source of information for many natural language processing and information retrieval tasks. Extracting the main content from those documents is... -
Concept Embedding for Information Retrieval
Conceptual indexing includes the process of annotating raw text by concepts of a particular knowledge source. It is used to represent the content of documents and queries by...