Ubuntu Dialogue Corpus (UDC)

The Ubuntu Dialogue Corpus (UDC) dataset was extracted from the Ubuntu Relay Chat Channel. Although the topics in the dataset are not as diverse as in the MTC, the dataset is very large, containing about 1.85 million conversations with an average of 5 utterances per conversation.

BibTex: