HYPOTHESIS STITCHER FOR END-TO-END SPEAKER-ATTRIBUTED ASR ON LONG-FORM MULTI-TALKER RECORDINGS

An end-to-end (E2E) speaker-attributed automatic speech recognition (SA-ASR) model was proposed recently to jointly perform speaker counting, speech recognition and speaker identiļ¬cation.

Data and Resources

Cite this as

Xuankai Chang, Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Takuya Yoshioka (2024). Dataset: HYPOTHESIS STITCHER FOR END-TO-END SPEAKER-ATTRIBUTED ASR ON LONG-FORM MULTI-TALKER RECORDINGS. https://doi.org/10.57702/h4tkatea

DOI retrieved: December 16, 2024

Additional Info

Field Value
Created December 16, 2024
Last update December 16, 2024
Defined In https://doi.org/10.48550/arXiv.2101.01853
Author Xuankai Chang
More Authors
Naoyuki Kanda
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Takuya Yoshioka