-
WikiSum Evaluation Dataset
A dataset of 1,527 Wikipedia biographies about women, for whom information on the internet is less easily retrieved. -
Problems with evaluation of word embeddings using word similarity tasks
This dataset has no description
-
USR-TopicalChat
This dataset is used for dialogue response evaluation. -
USR-PersonaChat
This dataset is used for dialogue response evaluation. -
Dailydialog-Eval
This dataset is used for dialogue response evaluation. -
FlyingChairs
A synthetic dataset for optical flow training and evaluation, built from rendered 3D chair models overlaid on background images. -
Open Graph Benchmark: Datasets for Machine Learning on Graphs
Open Graph Benchmark: Datasets for machine learning on graphs. -
Synthetic Networks
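The dataset used in the paper consists of synthetic networks generated under four network models: the stochastic block model (SBM), the degree-corrected block model (DCBM), the random dot product graph (RDPG), and the latent space model. -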
Contact Network of Secondary School Students
The dataset used in the paper is a collection of real and synthetic networks for community detection evaluation. -
Scientific Collaboration Networks
The dataset used in the paper is a collection of real and synthetic networks for community detection evaluation. -
Community Detection Evaluation
The dataset used in the paper is a collection of real and synthetic networks for community detection evaluation. -
A Modularized Evaluation for Topic Popularity Prediction
Topic popularity prediction in social networks has drawn much attention recently. Various elegant models have been proposed for this issue. However, different datasets and... -
HaluEval-Sum
A large-scale hallucination evaluation benchmark for large language models. -
Text Simplification Datasets: Exploration
Text simplification datasets have limitations and need to be improved to build more robust models. -
GPT-4 Evaluation Dataset
The dataset used for the evaluation of GPT-4's performance in systematic review tasks. -
Pitfalls of graph neural network evaluation
Pitfalls of graph neural network evaluation -
Surprise Test Set
The surprise test set is used to evaluate the performance of the proposed system.