Dataset - Groups

Collective Classification in Network Data

The Collective Classification in Network Data dataset is used for graph neural network research.
- Dataset
- JSON
Cora, Citeseer, and Polblogs datasets

The Cora, Citeseer, and Polblogs datasets are widely used for graph neural network research.
- Dataset
- JSON
CiteSeerX Name Disambiguation Dataset

The dataset contains 10 highly ambiguous name references with 1091 documents and 74 distinct real-life authors.
- Dataset
- JSON
Arnetminer Name Disambiguation Dataset

The dataset contains 10 highly ambiguous name references with 1091 documents and 74 distinct real-life authors.
- Dataset
- JSON
TSPLIB

The Travelling Salesman Problem (TSP) and Vehicle Routing Problem (VRP) are the two most prevalent routing problems.
- Dataset
- JSON
Misinformation Detection Dataset

A dataset of 248 well-cited papers in the field of misinformation detection.
- Dataset
- JSON
Mozart Dataset

The dataset used for training the model consists of 13 pieces of Mozart, 989 pieces for validation, and 11,821 pieces for testing.
- Dataset
- JSON
KEEL repository

The Knowledge Extraction based on Evolutionary Learning (KEEL) repository contains 64 datasets for the experiments.
- Dataset
- JSON
Gemma: Open models based on gemini research and technology

This dataset contains a large corpus of text for training and evaluating large language models.
- Dataset
- JSON
Llama 2: Open foundation and fine-tuned chat models

This dataset contains a large corpus of text for training and evaluating large language models.
- Dataset
- JSON
harmless/harmful anchor datasets

This dataset contains 100 harmless and 100 harmful anchor prompts for evaluating the performance of large language models.
- Dataset
- JSON
CCCS-CIC-AndMal-2020

The CCCS-CIC-AndMal-2020 dataset comprises 400K Android apps and is used to test and assess the proposed methodology.
- Dataset
- JSON
Colosseum dataset

The Colosseum dataset is a flow-level network traffic dataset used to train and test the traffic steering algorithm.
- Dataset
- JSON
PMLB

PMLB is a collection of 156 benchmark datasets for machine learning model evaluation.
- Dataset
- JSON
Road Network Dataset

A dataset for testing the proposed algorithm, consisting of a road network with 1719 vertices and 2280 edges.
- Dataset
- JSON
Generated Instances for SUTP

The dataset contains 420 instances with a varying number of big trains, each with 1 to 4 unit trains, and different configurations of dumpers, conveyors, and stackers.
- Dataset
- JSON
Stanford Glaucoma Dataset

A dataset of OCT scans consisting of glaucoma and non-glaucomatous cases obtained from four tertiary care eye hospitals located in four different countries.
- Dataset
- JSON
Enron Corpus

The Enron corpus is a dataset of over 17K Excel Spreadsheets extracted from the Enron email corpus.
- Dataset
- JSON
Google Sheets Dataset

The dataset is constructed from a corpus of Google Sheets publicly shared within our organization. We collected 46K Google Sheets with formulas, and split them into 42K for...
- Dataset
- JSON
tvtLANE

A hybrid spatial-temporal deep learning architecture for lane detection
- Dataset
- JSON

30 datasets found