GTFS-Madrid-Bench
A benchmark to evaluate declarative KG construction engines that can be used to provide access mechanisms to (virtual) knowledge graphs. Our proposal introduces several... -
WN18RR Benchmark
WN18RR is a link prediction dataset created from WN18, which is a subset of WordNet. WN18 consists of 18 relations and 40,943 entities. However, many test triples are obtained... -
SPaRKLE: Symbolic caPtuRing of knowledge for Knowledge graph enrichment with ...
SPaRKLE is a hybrid method that combines symbolic and mathematical methodologies while leveraging Partial Completeness Assumption (PCA) heuristics to capture implicit information... -
The Family KG
Statistical predicate invention is considered a key problem in statistical relational learning. SPI involves discovering new concepts, properties, and relations within... -
The French Royalty KG
The French Royalty KG is created by extracting information about French royal families from DBpedia, and for each person, we added the class dbo:Person as well as different... -
FB15k-237 Benchmark
FB15k-237 is a link prediction dataset created from FB15k. While FB15k consists of 1,345 relations, 14,951 entities, and 592,213 triples, many triples are inverses that cause... -
Berlin SPARQL Benchmark (BSBM)
The SPARQL Query Language for RDF and the SPARQL Protocol for RDF are implemented by a growing number of storage systems and are used within enterprise and open web settings. As... -
The Lehigh University Benchmark (LUBM)
The Lehigh University Benchmark is developed to facilitate the evaluation of Semantic Web repositories in a standard and systematic way. The benchmark is intended to evaluate... -
Waterloo SPARQL Diversity Test Suite (WatDiv) Benchmark
WatDiv is a benchmark designed to measure how an RDF data management system performs across a wide spectrum of SPARQL queries with varying structural characteristics and... -
WN18 (WordNet18)
The WN18 dataset has 18 relations scraped from WordNet for roughly 41,000 synsets, resulting in 141,442 triplets. It was found that a large number of the test triplets can... -
FB15k (Freebase 15K)
The FB15k dataset contains knowledge base relation triples and textual mentions of Freebase entity pairs. It has a total of 592,213 triplets with 14,951 entities and 1,345... -
YAGO3-10 (Yet Another Great Ontology 3-10)
YAGO3-10 is a benchmark dataset for knowledge base completion. It is a subset of YAGO3 (which itself is an extension of YAGO) that contains entities associated with at least ten... -
Entity Summarization Benchmark (ESBM)
ESBM (short for Entity Summarization BenchMark) is a benchmark for evaluating entity summarization algorithms, also known as entity summarizers. The latest version is dated 2019-12-08. -
Monash Time Series Forecasting Repository
All datasets contain univariate time series and are available in a new format that we call .tsf, pioneered by the sktime .ts format. -
SDM-Genomic-Dataset
This benchmark is created by randomly sampling data records from somatic mutation data collected in COSMIC (https://cancer.sanger.ac.uk/cosmic). SDM-Genomic-Datasets include... -
KGQA-Datasets
This dataset is a collection of existing KGQA datasets in the format of the Hugging Face datasets library, aiming to provide easy-to-use access to them.