-
LNMap: Departures from Isomorphic Assumption in Bilingual Lexicon Induction
LNMap: Departures from isomorphic assumption in bilingual lexicon induction through non-linear mapping in latent space.
-
Learning Principled Bilingual Word Embeddings
Learning principled bilingual mappings of word embeddings while preserving monolingual invariance.
-
RAPO: An Adaptive Ranking Paradigm for Bilingual Lexicon Induction
Bilingual lexicon induction induces word translations by aligning independently trained word embeddings in two languages.
-
Exponential Family Embeddings
Word embeddings are a powerful approach for capturing semantic similarity among terms in a vocabulary. In this paper, we develop exponential family embeddings, a class of...
-
Intrinsic evaluations of word embeddings: What can we do better?
This dataset has no description
-
Problems with evaluation of word embeddings using word similarity tasks
This dataset has no description
-
Improving zero-shot learning by mitigating the hubness problem
This dataset has no description
-
Distributed representations of words and phrases and their compositionality
The word2vec dataset is a word embedding dataset that contains 3 million words.
-
Improving distributional similarity with lessons learned from word embeddings
This dataset has no description
-
Learning Word Embeddings from the Portuguese Twitter Stream: A Study of some ...
This paper describes a preliminary study for producing and distributing a large-scale database of embeddings from the Portuguese Twitter stream.
-
Word2Vec Dataset
The Word2Vec dataset.
-
Massively Multilingual Word Embeddings
Massively multilingual word embeddings.
-
Learning Sentiment-Specific Word Embeddings from Distant Supervision
Sentiment-specific word embeddings dataset.
-
Wikipedia2Vec dataset
The Wikipedia2Vec dataset used in the paper, which contains word embeddings.
-
Scientific Articles Corpus
A large-scale academic corpus containing the titles and abstracts of approximately 70 million scientific articles.
-
Polyglot Wikipedia
The dataset used for training and testing the MVLSA model.