Dataset - LDM

CLEVRER

CLEVRER is a dataset for video reasoning, in which videos are presented to the model along with a set of related questions, and the model's outputs are the answers to...
- Dataset
- JSON
Singapore Rapid Transit Systems Regulations

Singapore Rapid Transit Systems Regulations is a collection of regulations proclaimed by the Singapore government.
- Dataset
- JSON
Universal and transferable adversarial attacks on aligned language models

AdvBench is a dataset for evaluating the safety of large language models.
- Dataset
- JSON
Social Chemistry 101: Learning to reason about social and moral norms

Social Chemistry 101 is a dataset that encompasses diverse social norms.
- Dataset
- JSON
Aligning AI with shared human values

ETHICS is a benchmark for evaluating a language model's knowledge of fundamental ethical concepts.
- Dataset
- JSON
CrowS-Pairs: A challenge dataset for measuring social biases in masked langua...

CrowS-Pairs is a challenge dataset for measuring social biases in masked language models.
- Dataset
- JSON
ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evalua...

ALI-Agent is an evaluation framework that leverages the autonomous abilities of LLM-powered agents to probe adaptive and long-tail risks in target LLMs.
- Dataset
- JSON
Opportunity activity recognition dataset

Opportunity activity recognition dataset contains questions answerable using Wikidata as the knowledge graph, focusing on questions with a single entity and relation.
- Dataset
- JSON
Helpful and Harmless

The dataset used for training and evaluation of the proposed RRHF paradigm.
- Dataset
- JSON
DBpedia

DBpedia is a public knowledge graph which is derived from structured information in Wikipedia, mainly infoboxes.
- Dataset
- JSON
TREC Deep Learning 2020

Large-scale passage retrieval aims to fetch relevant passages from a million- or billion-scale collection for a given query to meet users’ information needs, serving as an...
- Dataset
- JSON
TREC Deep Learning 2019

Large-scale passage retrieval aims to fetch relevant passages from a million- or billion-scale collection for a given query to meet users’ information needs, serving as an...
- Dataset
- JSON
GQA

GQA is a visual question answering dataset that focuses on compositional question answering and visual reasoning about real-world images.
- Dataset
- JSON
Piazza QA dataset

A dataset of 50 question-answer pairs from a programming languages course at a large public university.
- Dataset
- JSON
Paralex

The source paper proposes a method for generating paraphrases of English questions that retain the original intent but use a different surface form.
- Dataset
- JSON
Factorising Meaning and Form for Intent-Preserving Paraphrasing

This paper proposes a method for generating paraphrases of English questions that retain the original intent but use a different surface form.
- Dataset
- JSON
TGIF-QA

The TGIF-QA dataset consists of 165,165 QA pairs chosen from 71,741 animated GIFs. To evaluate the spatiotemporal reasoning ability at the video level, the TGIF-QA dataset designs...
- Dataset
- JSON
TREC DL

The TREC 2019 Deep Learning Track has the same training and dev sets as MS MARCO, but replaces the test set with a novel set produced by TREC.
- Dataset
- JSON
NQ

An open-domain QA dataset that consists of a question, a retrieved article, a selected paragraph from the article, and a short answer inferable from the paragraph.
- Dataset
- JSON
PathVQA

The dataset used in the paper is a set of sequential vision-and-language tasks, where each task consists of an image and a text input.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

196 datasets found