Audio-Visual Dataset and Deep Learning Frameworks for Crowded Scene Classification

An audio-visual dataset covering five crowded-scene categories: 'Riot', 'Noise-Street', 'Firework-Event', 'Music-Event', and 'Sport-Atmosphere'.

Cite this as

Lam Pham, Dat Ngo, Phu X. Nguyen, Truong Hoang, Alexander Schindler (2024). Dataset: Audio-Visual Dataset and Deep Learning Frameworks for Crowded Scene Classification. https://doi.org/10.57702/aqsojxtn

DOI retrieved: December 17, 2024

Additional Info

Field          Value
Created        December 17, 2024
Last update    December 17, 2024
Defined In     https://doi.org/10.48550/arXiv.2112.09172
Author         Lam Pham
More Authors   Dat Ngo, Phu X. Nguyen, Truong Hoang, Alexander Schindler
Homepage       https://zenodo.org/record/5774751