2 datasets found
Groups: Information Retrieval
Organizations: No Organization
Formats: JSON

MathMLBen
The MathMLBen dataset is used to evaluate the performance of formula embedding techniques for mathematical information retrieval.
Dataset (JSON)

arXMLiv 2018
The arXMLiv 2018 dataset is an HTML collection of the arXiv.org preprint archive, used as a training corpus for word embedding techniques.
Dataset (JSON)