-
NYT and WebNLG
NYT and WebNLG are widely used datasets for relational triple extraction. -
VisualBERT
VisualBERT is a pre-trained model for vision-and-language tasks, built on top of PyTorch. -
Task Driven Image Understanding Challenge (TDIUC)
The Task Driven Image Understanding Challenge (TDIUC) dataset is a large VQA dataset that organizes its questions into 12 fine-grained categories, proposed to compensate for the bias in distribution of... -
WebQA, CEval, CMMLU, and MMLU
WebQA, CEval, CMMLU, and MMLU are datasets used for general chat. -
SimpleQuestion
The SimpleQuestion dataset is a question answering dataset consisting of 100,000 questions and 1,000,000 answers. -
Multi-Image VQA for Unsupervised Anomaly Detection
Unsupervised anomaly detection dataset for multi-image visual question answering -
OpenOrca dataset
The dataset used for the Vectara hallucination task, containing OpenOrca questions. -
Youtube2Text-QA
A dataset for video question answering, a task that requires machines to answer questions about videos in natural language. -
Universal Conceptual Cognitive Annotation (UCCA)
Universal Conceptual Cognitive Annotation (UCCA) is a graph-based semantic annotation scheme based on typological linguistic principles.