FastText is a subword embedding model rather than a dataset: it produces a vector representation of a word by composing the embeddings of the character n-grams that make up the word.
The Penn Treebank dataset contains approximately one million words of 1989 Wall Street Journal material annotated in Treebank II style, comprising about 42k sentences of varying lengths.
The dataset used in the paper is not explicitly described; the authors mention only that they tested the proposed method on three real data sets for the most relevant security...