1 dataset found

Groups: Continual Pre-training
Formats: JSON

  • CMR Scaling Law

    The dataset used in the paper mixes a general corpus with a domain-specific corpus. The paper finds a power-law relationship between loss, mixture ratio, and training-token scale.