Dataset - LDM

Singapore Rapid Transit Systems Regulations

Singapore Rapid Transit Systems Regulations is a collection of regulations proclaimed by the Singapore government.
- Dataset
- JSON
Universal and transferable adversarial attacks on aligned language models

AdvBench is a dataset for evaluating the safety of large language models.
- Dataset
- JSON
Social Chemistry 101: Learning to reason about social and moral norms

Social Chemistry 101 is a dataset that encompasses diverse social norms.
- Dataset
- JSON
Aligning AI with shared human values

ETHICS is a benchmark for evaluating a language model's knowledge of fundamental ethical concepts.
- Dataset
- JSON
Crows-pairs: A challenge dataset for measuring social biases in masked langua...

CrowS-Pairs is a challenge dataset for measuring social biases in masked language models.
- Dataset
- JSON
ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evalua...

ALI-Agent is an evaluation framework that leverages the autonomous abilities of LLM-powered agents to probe adaptive and long-tail risks in target LLMs.
- Dataset
- JSON
Opportunity activity recognition dataset

The Opportunity activity recognition dataset contains recordings from wearable, object, and ambient sensors for human activity recognition.
- Dataset
- JSON
Helpful and Harmless

Helpful and Harmless is the dataset used to train and evaluate the proposed RRHF paradigm.
- Dataset
- JSON
DocRED

DocRED is a large-scale human-annotated dataset for document-level RE, which is constructed from Wikipedia and Wikidata.
- Dataset
- JSON
TREC Deep Learning 2020

Large-scale passage retrieval aims to fetch relevant passages from a million- or billion-scale collection for a given query to meet users’ information needs, serving as an...
- Dataset
- JSON
TREC Deep Learning 2019

Large-scale passage retrieval aims to fetch relevant passages from a million- or billion-scale collection for a given query to meet users’ information needs, serving as an...
- Dataset
- JSON
WN18RR

Knowledge graphs store a wealth of knowledge from the real world into structured graphs, which consist of collections of triplets, and each triplet (h, r, t) represents that...
- Dataset
- JSON
SQuAD

SQuAD is a reading comprehension dataset of questions posed on Wikipedia articles, where the answer to each question is a span of text from the corresponding passage.
- Dataset
- JSON
SimpleQuestion Dataset

The SimpleQuestion dataset contains questions answerable using Wikidata as the knowledge graph.
- Dataset
- JSON
Collective classification in network data

Collective classification in network data.
- Dataset
- JSON
FUNSD dataset

FUNSD (Form Understanding in Noisy Scanned Documents) is a dataset of annotated scanned forms for form understanding.
- Dataset
- JSON
CORD dataset

CORD (Consolidated Receipt Dataset) is a dataset of annotated receipt images for post-OCR parsing.
- Dataset
- JSON
Neural Collaborative Filtering

The dataset is used for neural collaborative filtering, which is a type of collaborative filtering that uses neural networks to learn the relationships between users and items.
- Dataset
- JSON
IMDB-RLHF-Pair dataset

The IMDB-RLHF-Pair dataset is generated from IMDB data, and responses with positive sentiment are preferred.
- Dataset
- JSON
Stack-Exchange-Paired dataset

The Stack-Exchange-Paired dataset contains questions and answers from Stack Exchange, where answers with more votes are preferred.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).
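The page does not describe the API itself, so the following is only a minimal sketch under stated assumptions: it assumes the registry runs on CKAN and exposes the standard `package_search` action, and it uses a hypothetical base URL (`https://registry.example.org`) because the real address is not given here. The helper names `build_search_url` and `extract_titles` are illustrative, and the sample response is hand-written rather than taken from the registry.

```python
from urllib.parse import urlencode

# Hypothetical base URL: the source does not give the registry's address.
BASE_URL = "https://registry.example.org"


def build_search_url(query, rows=10, base_url=BASE_URL):
    """Build a CKAN-style package_search URL (assumed API flavour)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def extract_titles(payload):
    """Return dataset titles from a CKAN-style search response."""
    if not payload.get("success"):
        return []
    return [item.get("title", "") for item in payload["result"]["results"]]


# Hand-written sample response, shaped like CKAN's package_search output.
sample = {
    "success": True,
    "result": {
        "count": 2,
        "results": [{"title": "DocRED"}, {"title": "WN18RR"}],
    },
}

url = build_search_url("knowledge graph", rows=5)
titles = extract_titles(sample)
```

In practice the URL would be fetched with an HTTP client and the decoded JSON passed to `extract_titles`; the API Docs remain the authority on the actual endpoints and parameters.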

67 datasets found