Vietnamese Speech Dataset for Named Entity Recognition

doi:10.57702/v9h5ecu8

The first Vietnamese speech dataset for the named entity recognition (NER) task, released together with the first publicly available, pre-trained, large-scale monolingual language model for Vietnamese. The model sets a new state of the art on the Vietnamese NER task, improving on the most recent prior study by 1.3% absolute F1.

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Cite this as

Thai Binh Nguyen, Quang Minh Nguyen, Thi Thu Hien Nguyen, Quoc Truong Do, Chi Mai Luong (2024). Dataset: Vietnamese Speech Dataset for Named Entity Recognition. https://doi.org/10.57702/v9h5ecu8

DOI retrieved: December 3, 2024

Additional Info

Field	Value
Created	December 3, 2024
Last update	December 3, 2024
Defined In	https://doi.org/10.48550/arXiv.2010.00198
Author	Thai Binh Nguyen
More Authors	Quang Minh Nguyen, Thi Thu Hien Nguyen, Quoc Truong Do, Chi Mai Luong