The ETAPE corpus for the evaluation of speech-based TV content processing in ...
The ETAPE corpus for the evaluation of speech-based TV content processing in the French language contains TV broadcasts.
French TreeBank
The French TreeBank (FTB) is a treebank for French, containing 650k sentences.
Common Crawl
The Common Crawl (CC) project crawls and indexes publicly available web content. It generates 200-300 TiB of data per month (around 5% of which is in French), and constitutes the...