GLUE

Pre-trained language models (PrLMs) have to carefully manage input units when training on very large corpora with vocabularies of millions of words. Previous work has shown that incorporating span-level information over consecutive words during pre-training can further improve the performance of PrLMs.
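As a rough illustration of what span-level information means here, the sketch below masks one span of consecutive tokens rather than independent single tokens. This is a minimal toy example, not the method of the cited paper; the function name, span-length cap, and mask token are assumptions for illustration only.

```python
import random

def mask_random_span(tokens, mask_token="[MASK]", max_span_len=3):
    """Replace one randomly chosen span of consecutive tokens with mask tokens.

    Toy sketch of span-level masking: real span-based pre-training objectives
    sample many spans from a length distribution and train the model to
    recover the original tokens from the surrounding context.
    """
    if not tokens:
        return tokens
    span_len = random.randint(1, min(max_span_len, len(tokens)))
    start = random.randint(0, len(tokens) - span_len)
    masked = list(tokens)
    masked[start:start + span_len] = [mask_token] * span_len
    return masked

tokens = "span level information over consecutive words".split()
print(mask_random_span(tokens))
# e.g. ['span', 'level', '[MASK]', '[MASK]', 'consecutive', 'words']
```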

Data and Resources

Cite this as

Yi Yang, Chen Zhang, Dawei Song (2024). Dataset: GLUE. https://doi.org/10.57702/byuoeozp

DOI retrieved: December 2, 2024

Additional Info

Field          Value
Created        December 2, 2024
Last update    December 2, 2024
Defined In     https://doi.org/10.48550/arXiv.2201.04467
Citation       • https://doi.org/10.48550/arXiv.2305.17197
               • https://doi.org/10.48550/arXiv.1909.03004
               • https://doi.org/10.48550/arXiv.2310.14110
               • https://doi.org/10.48550/arXiv.2405.04513
               • https://doi.org/10.1073/pnas.2215907120
               • https://doi.org/10.48550/arXiv.2210.03923
               • https://doi.org/10.48550/arXiv.2211.07350
               • https://doi.org/10.48550/arXiv.2106.08823
               • https://doi.org/10.48550/arXiv.1904.12166
               • https://doi.org/10.48550/arXiv.2211.09744
               • https://doi.org/10.48550/arXiv.2108.12848
Author         Yi Yang
More Authors   Chen Zhang, Dawei Song
Homepage       https://glue.mlt.io/