3 datasets found

Tags: Indian languages

  • MSR

    The MSR dataset is a widely used vulnerability detection dataset, consisting of 10,900 vulnerable examples and 177,736 non-vulnerable examples.
  • Vakyansh

    The Vakyansh dataset is used to train and test punctuation restoration and inverse text normalization models.
  • Samanantar dataset

    The Samanantar dataset contains 49.6 million sentence pairs between English and 11 Indian languages.
You can also access this registry using the API (see API Docs).