FastText

doi:10.57702/ce7c3j4u

The FastText dataset provides a subword-level word embedding model. It builds the vector representation of a word by composing the embeddings of the character n-grams that make up the word.
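
The composition described above can be illustrated with a minimal sketch. It is not the FastText implementation: the CRC32 hashing, the pseudo-random vectors standing in for trained embeddings, the averaging step, and the names `char_ngrams`, `ngram_vector`, and `word_vector` are all assumptions made for illustration. Only the boundary markers `<` and `>` and the n-gram lengths 3 to 6 follow the convention commonly described for FastText.

```python
import random
import zlib


def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of a word wrapped in boundary markers."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


def ngram_vector(gram, dim=8, buckets=1000):
    """Hypothetical lookup: map an n-gram to a bucket, then to a fixed vector.

    A trained model would read a learned embedding row here; this sketch
    uses a pseudo-random vector seeded by the bucket index (assumption).
    """
    bucket = zlib.crc32(gram.encode("utf-8")) % buckets
    rng = random.Random(bucket)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def word_vector(word, dim=8):
    """Compose a word vector by averaging the vectors of its n-grams."""
    grams = char_ngrams(word)
    if not grams:
        return [0.0] * dim
    total = [0.0] * dim
    for gram in grams:
        vec = ngram_vector(gram, dim)
        for k in range(dim):
            total[k] += vec[k]
    return [x / len(grams) for x in total]


trigrams = char_ngrams("where", 3, 3)
vector = word_vector("where")
```

Because the vector is built from n-grams rather than looked up per word, a word that never appeared in training still receives a representation, provided it has at least one n-gram.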

Data and Resources

Original Metadata (JSON)
The json representation of the dataset with its distributions based on DCAT.

Cite this as

Piotr Bojanowski, Edouard Grave, Armand Joulin, Tomas Mikolov (2024). Dataset: FastText. https://doi.org/10.57702/ce7c3j4u

DOI retrieved: December 16, 2024

Additional Info

Field	Value
Created	December 16, 2024
Last update	December 16, 2024
Defined In	https://doi.org/10.48550/arXiv.2010.15939
Citation	https://doi.org/10.48550/arXiv.1902.07181 https://doi.org/10.48550/arXiv.2207.01246 https://doi.org/10.48550/arXiv.1904.12683
Author	Piotr Bojanowski
More Authors	Edouard Grave Armand Joulin Tomas Mikolov
Homepage	https://arxiv.org/abs/1603.08742