FastText

The FastText dataset provides a subword token embedding model. It represents each word as a vector composed from the embeddings of the character n-grams that make up the word.
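The composition described above can be illustrated with a minimal sketch. It is not the reference FastText implementation. The names `char_ngrams` and `word_vector`, the `<`/`>` boundary markers, the 3 to 6 n-gram range, the CRC32 bucket hash, the averaging step, and the randomly initialised table are all assumptions chosen for illustration.

```python
import zlib

import numpy as np


def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of `word`, wrapped in boundary markers."""
    wrapped = f"<{word}>"  # assumed markers for word start and end
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


def word_vector(word, table, n_min=3, n_max=6):
    """Compose a word vector by averaging the vectors of its n-grams.

    `table` is a (num_buckets, dim) array of n-gram embeddings. Each
    n-gram is mapped to a row by hashing; CRC32 is a stand-in hash
    here, not necessarily the one used by the released model.
    """
    num_buckets, dim = table.shape
    grams = char_ngrams(word, n_min, n_max)
    if not grams:
        return np.zeros(dim)
    rows = [zlib.crc32(g.encode("utf-8")) % num_buckets for g in grams]
    return table[rows].mean(axis=0)


# Hypothetical embedding table: random values stand in for trained weights.
rng = np.random.default_rng(0)
table = rng.normal(size=(1000, 8))

print(char_ngrams("where", 3, 3))  # ['<wh', 'whe', 'her', 'ere', 're>']
vec = word_vector("where", table)
print(vec.shape)  # (8,)
```

Because the vector is built from n-grams rather than looked up by whole word, this sketch returns a vector for any input string, including words that never appeared in training data.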

Cite this as

Piotr Bojanowski, Edouard Grave, Armand Joulin, Tomas Mikolov (2024). Dataset: FastText. https://doi.org/10.57702/ce7c3j4u

DOI retrieved: December 16, 2024

Additional Info

Field         Value
Created       December 16, 2024
Last update   December 16, 2024
Defined In    https://doi.org/10.48550/arXiv.2010.15939
Citation      https://doi.org/10.48550/arXiv.1902.07181
              https://doi.org/10.48550/arXiv.2207.01246
              https://doi.org/10.48550/arXiv.1904.12683
Author        Piotr Bojanowski
More Authors  Edouard Grave
              Armand Joulin
              Tomas Mikolov
Homepage      https://arxiv.org/abs/1603.08742