CMR Scaling Law

doi:10.57702/8vx49sro

The dataset used in the paper is a mixture of a general corpus and a domain-specific corpus. The paper describes a power-law relationship between loss, mixture ratio, and the scale of training tokens.
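
As a rough illustration of what a power-law relationship between loss and training tokens means in practice, the sketch below fits loss = a * tokens^(-b) by linear regression in log-log space. This record does not give the exact functional form used in the paper, so the single-variable form, the function name `fit_power_law`, and all numbers are assumptions chosen for illustration; the mixture ratio is deliberately left out.

```python
import math

def fit_power_law(tokens, losses):
    """Least-squares fit of loss = a * tokens**(-b) in log-log space.

    Hypothetical helper: the form is a generic power law, not
    necessarily the one used in the CMR Scaling Law paper.
    """
    xs = [math.log(t) for t in tokens]
    ys = [math.log(l) for l in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log(loss) = log(a) - b * log(tokens), so a = exp(intercept), b = -slope
    return math.exp(intercept), -slope

# Synthetic data generated from a = 5.0, b = 0.05 (not from the dataset).
tokens = [1e8, 1e9, 1e10, 1e11]
losses = [5.0 * t ** -0.05 for t in tokens]
a, b = fit_power_law(tokens, losses)
print(f"a = {a:.3f}, b = {b:.3f}")
```

With real measurements, one such fit could be repeated per mixture ratio to see how the fitted coefficients shift as the share of domain-specific data changes.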

Data and Resources

Original Metadata (JSON)
The JSON representation of the dataset and its distributions, based on DCAT.

Cite this as

Jiawei Gu, Zacc Yang, Chuanghao Ding, Rui Zhao, Fei Tan (2025). Dataset: CMR Scaling Law. https://doi.org/10.57702/8vx49sro

DOI retrieved: January 2, 2025

Additional Info

Field	Value
Created	January 2, 2025
Last update	January 2, 2025
Defined In	https://doi.org/10.48550/arXiv.2407.17467
Author	Jiawei Gu
More Authors	Zacc Yang, Chuanghao Ding, Rui Zhao, Fei Tan