Movie Triples Corpus (MTC)

doi:10.57702/yv676lps

The Movie Triples Corpus (MTC) was derived from the Movie-DiC dataset (Banchs, 2012). Although it spans a wide range of topics and contains few spelling mistakes, its small size, only about 240,000 dialogue triples, makes it difficult to train a dialogue model on, as pointed out by Serban et al. (2016).

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset with its distributions, based on DCAT.

Cite this as

Oluwatobi Olabiyi, Alan Salimov, Anish Khazane, Erik T. Mueller (2025). Dataset: Movie Triples Corpus (MTC). https://doi.org/10.57702/yv676lps

DOI retrieved: January 2, 2025

Additional Info

Field	Value
Created	January 2, 2025
Last update	January 2, 2025
Defined In	https://doi.org/10.48550/arXiv.1805.11752
Author	Oluwatobi Olabiyi
More Authors	Alan Salimov, Anish Khazane, Erik T. Mueller
Homepage	https://github.com/julianser/hed-dlg-truncated