Constructing Multi-Modal Dialogue Dataset by Replacing Text with Semantically Relevant Images

A multi-modal dialogue dataset created by replacing text with semantically relevant images.

BibTeX: