Multilingual Corpus - Groups

OSCAR corpus

The dataset used in this study is the OSCAR corpus, a multilingual corpus obtained by filtering the Common Crawl corpus.
- Dataset
- JSON

Parallel Meaning Bank

A semantically annotated parallel corpus for English, German, Italian, and Dutch where sentences are aligned with scoped meaning representations in order to capture the...
- Dataset
- JSON


2 datasets found