Matching Latent Encoding for Audio-Text based Keyword Spotting

doi:10.57702/46jr1588

The proposed end-to-end model architecture for flexible keyword spotting, consisting of encoder, projector, and audio-text aligner modules.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Cite this as

Kumari Nishu, Minsik Cho, Devang Naik (2024). Dataset: Matching Latent Encoding for Audio-Text based Keyword Spotting. https://doi.org/10.57702/46jr1588

DOI retrieved: December 2, 2024

Additional Info

Field	Value
Created	December 2, 2024
Last update	December 2, 2024
Defined In	https://doi.org/10.48550/arXiv.2306.05245
Author	Kumari Nishu
More Authors	Minsik Cho, Devang Naik