Explainable Deep Clustering for Monaural Speech Separation

The proposed X-DC model uses a dataset of speech mixtures containing two, four, or eight speakers.

BibTeX: