Dataset - LDM

Russian Noun Dataset

The dataset used for clustering contains the 2000 most frequent nouns in the Russian Web corpus.
- Dataset
- JSON

Spanish Noun Dataset

The dataset used for clustering contains the 2000 most frequent nouns in the Spanish Gigaword corpus.
- Dataset
- JSON

English Noun Dataset

The dataset used for clustering contains the 2000 most frequent nouns in the British National Corpus (BNC) and the English Gigaword corpus.
- Dataset
- JSON

You can also access this registry using the API (see API Docs).
