-
SemEval-2010 Task 5: Automatic Keyphrase Extraction from Scientific Articles
-
TREC-CAR Benchmark Y1
The dataset used for the Retrieve-Cluster-Summarize system, consisting of 117 article-level queries and 126 test queries. -
YouTube Clickbait Detection Dataset
The dataset is a collection of online videos from YouTube, with comments and metadata. It is used to evaluate the performance of the Online Video Clickbait Protector (OVCP) scheme. -
PMING Distance
PMING Distance is a measure of proximity that conveys information on relationships between two terms (e.g. words or expressions) carrying semantic meaning, used on various... -
A citation-based method for automatic indexing of Chinese academic literatures
The dataset used in this paper for a citation-based method for automatic indexing of Chinese academic literature. -
WINGNUS: Keyphrase extraction utilizing document logical structure
The dataset used in this paper for keyphrase extraction utilizing document logical structure. -
SemEval-2010 Task 5 dataset
The dataset used in this paper for keyphrase extraction from academic articles. -
Leveraging Passage Embeddings for Efficient Listwise Reranking
Passage ranking aims to rank each passage in a large corpus according to its relevance to the user's information need, expressed as a short query. -
BBC News dataset
The BBC News dataset was used for sentiment analysis of news articles. -
NIPS full paper dataset
The NIPS full paper dataset is a collection of text documents. -
ClueWeb09B
The ClueWeb09B collection (ClueWeb09 Category B) is a large-scale web corpus of roughly 50 million English web pages, commonly used for web search evaluation. -
AOL Dataset
The AOL dataset contains a collection of queries and documents for search engine evaluation. -
TREC 2004 Robust Retrieval Track
The TREC 2004 Robust Retrieval Track dataset contains a collection of documents and queries for robust retrieval tasks.