CSQA

CSQA is a widely used benchmark for conversational knowledge base question answering (KBQA). It consists of around 200K dialogues, with the training, validation, and test sets containing 153K, 16K, and 28K dialogues, respectively.
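The split sizes above are rounded, so a quick arithmetic check helps show how they relate to the "around 200K" total. The following is a minimal sketch that uses only the approximate counts from the description; the dictionary keys are illustrative names, not identifiers from the dataset itself.

```python
# Approximate split sizes as stated in the description (number of dialogues).
# These are rounded figures, not exact counts.
splits = {"train": 153_000, "validation": 16_000, "test": 28_000}

# Sum of the rounded split sizes.
total = sum(splits.values())

# Share of each split as a percentage of the total, rounded to one decimal.
shares = {name: round(100 * n / total, 1) for name, n in splits.items()}

print(total)
print(shares)
```

The rounded splits sum to 197K, which is consistent with the stated total of around 200K dialogues.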

Data and Resources

This dataset has no data

Cite this as

Tim Hartill, Joshua Bensemann, Michael Witbrock, Patricia J. Riddle (2025). Dataset: CSQA. https://doi.org/10.57702/oes2pbwk

Private DOI: This DOI is not yet resolvable.
It is available for use in manuscripts and will be published when the dataset is made public.

Additional Info

Field	Value
Created	January 2, 2025
Last update	January 2, 2025
Defined In	https://doi.org/10.48550/arXiv.2306.06872
Citation	https://doi.org/10.48550/arXiv.2311.12337, https://doi.org/10.48550/arXiv.2302.12246, https://doi.org/10.48550/arXiv.2309.03882
Author	Tim Hartill
More Authors	Joshua Bensemann, Michael Witbrock, Patricia J. Riddle
Homepage	https://arxiv.org/abs/1909.04558