Vietnamese Speech Dataset for Named Entity Recognition

The first Vietnamese speech dataset for the NER task, together with the first publicly available pre-trained large-scale monolingual language model for Vietnamese, which achieves a new state of the art on the Vietnamese NER task, improving the absolute F1 score by 1.3% over the latest prior study.

Cite this as

Thai Binh Nguyen, Quang Minh Nguyen, Thi Thu Hien Nguyen, Quoc Truong Do, Chi Mai Luong (2024). Dataset: Vietnamese Speech Dataset for Named Entity Recognition. https://doi.org/10.57702/v9h5ecu8

DOI retrieved: December 3, 2024

Additional Info

Field          Value
Created        December 3, 2024
Last update    December 3, 2024
Defined In     https://doi.org/10.48550/arXiv.2010.00198
Author         Thai Binh Nguyen
More Authors   Quang Minh Nguyen
               Thi Thu Hien Nguyen
               Quoc Truong Do
               Chi Mai Luong