- COMET: A neural framework for MT evaluation
The COMET dataset contains human-annotated scores for machine translation candidates.
- WMT2020 Metrics Shared Task
The WMT2020 Metrics Shared Task dataset contains human-annotated scores for machine translation candidates.
- RoBLEURT Submission for the WMT2021 Metrics Task
RoBLEURT robustly optimizes the training of BLEURT, a trainable metric model for evaluating the semantic consistency between machine translation candidates and golden...
- Umsuka English-isiZulu Parallel Corpus
The Umsuka English-isiZulu Parallel Corpus provides a novel, high-quality parallel dataset for machine translation, containing English sentences sampled from both News Crawl...
- WMT’14 English-French and WMT’19 German-English datasets
Two types of datasets are used: traditional bilingual datasets and domain adaptation datasets.
- WMT 2020 Sentence-Level Direct Assessment dataset
The dataset used in the competition for the Sentence-Level Direct Assessment shared task is composed of data extracted from Wikipedia for six language pairs, consisting of...
- English-Hindi Parallel Corpus
The dataset is used for training and testing the machine translation systems.
- English-Hindi Outputs Quality Estimation using Naive Bayes Classifier
The dataset is used for training and testing the Naive Bayes classifier for quality estimation of English-Hindi outputs.
- Newstest2012 and Newstest2013
Newstest2012 and Newstest2013 are used for testing the proposed approach.
- WMT2021 Shared Task on Machine Translation Using Terminologies
The dataset used in this paper comes from the WMT2021 Shared Task on Machine Translation Using Terminologies and consists of 4.53M sentence pairs.
- WMT2014 German-English Translation Task
The dataset used in this paper comes from the WMT2014 German-English translation task and consists of 4.51M parallel sentence pairs.
- Shifts Machine Translation dataset
The Shifts Machine Translation dataset consists of pairs of source and target sentences in English and Russian.
- ParCor Dataset
The ParCor dataset is a parallel corpus of annotated pronouns.
- WIT3 Parallel Corpus
The WIT3 parallel corpus is a large-scale corpus of transcribed and translated talks.
- WMT 2016 Task on Cross-Lingual Pronoun Prediction
The WMT 2016 task on cross-lingual pronoun prediction is a classification task in which participants are asked to provide predictions on what pronoun class label should replace a...
- WMT dataset
The paper uses the WMT dataset, which contains machine translation data for various language pairs.
- WMT’17 metrics task
The dataset is used in the paper for validation studies of automatic metrics in natural language generation evaluation.
- Recurrent Continuous Translation Models
A neural machine translation toolkit that uses maximum likelihood as the training criterion.