Dataset - LDM

Towards Trustworthy AutoGrading of Short, Multi-lingual, Multi-type Answers

The dataset consists of approximately 10 million question-answer pairs from multiple languages covering diverse fields such as math and language, and strong variation in...
- Dataset
- JSON
MSRA-TD500

The MSRA-TD500 dataset is a benchmark for scene text detection, containing 300 training images and 200 test images, with multi-lingual, arbitrarily oriented, long text lines.
- Dataset
- JSON
ICDAR 2017 MLT

ICDAR 2017 MLT is a large-scale multi-lingual text dataset that includes 7200 training images, 1800 validation images and 9000 testing images. The dataset is composed of...
- Dataset
- JSON
QuixBugs

QuixBugs is a multi-lingual program repair benchmark set based on the Quixey Challenge.
- Dataset
- JSON
Conditional Generative Matching Model for Multi-lingual Reply Suggestion

A Conditional Generative Matching Model for Multi-lingual Reply Suggestion
- Dataset
- JSON

You can also access this registry using the API (see API Docs).

5 datasets found