VoxCeleb

doi:10.57702/9zson5bd

Speaker verification systems degrade significantly when trial recordings are short. To address this, a multi-scale feature fusion approach has been proposed to capture speaker characteristics from short utterances.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Cite this as

Arsha Nagrani, Joon Son Chung, Andrew Zisserman (2024). Dataset: VoxCeleb. https://doi.org/10.57702/9zson5bd

DOI retrieved: December 2, 2024

Additional Info

Field	Value
Created	December 2, 2024
Last update	December 2, 2024
Defined In	https://doi.org/10.48550/arXiv.2310.11004
Citation	https://doi.org/10.48550/arXiv.2306.00952 https://doi.org/10.48550/arXiv.2211.00511 https://doi.org/10.48550/arXiv.2308.14049 https://doi.org/10.48550/arXiv.2405.04296 https://doi.org/10.48550/arXiv.2401.08415 https://doi.org/10.48550/arXiv.2106.06362 https://doi.org/10.48550/arXiv.1903.10195 https://doi.org/10.48550/arXiv.1810.04826 https://doi.org/10.48550/arXiv.1906.08556 https://doi.org/10.48550/arXiv.1705.02966 https://doi.org/10.48550/arXiv.2207.04834 https://doi.org/10.48550/arXiv.2012.08261 https://doi.org/10.48550/arXiv.2102.06291 https://doi.org/10.48550/arXiv.2005.08781 https://doi.org/10.48550/arXiv.2110.02411 https://doi.org/10.21437/IberSPEECH.2022-34 https://doi.org/10.48550/arXiv.2202.09082 https://doi.org/10.48550/arXiv.2401.09146 https://doi.org/10.48550/arXiv.1806.08621
Author	Arsha Nagrani
More Authors	Joon Son Chung, Andrew Zisserman
Homepage	https://www.voxceleb.org/