DeepSpeech
The DeepSpeech dataset is used to evaluate the proposed watermarking scheme. -
Speech Pattern Based Black-Box Model Watermarking for Automatic Speech Recogn...
The proposed black-box model watermarking framework for protecting the IP of ASR models. -
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction o...
Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units -
wav2vec: Unsupervised Pre-Training for Speech Recognition
Unsupervised Pre-Training for Speech Recognition -
Speech Intelligibility Prediction with DNN-based Performance Measures
The dataset used for speech intelligibility prediction with DNN-based performance measures. -
Transformer based Whisper Bangla ASR model
A transformer-based Whisper Bangla ASR model. -
BD-4SK-ASR
BD-4SK-ASR is an experimental dataset used in the first attempt to develop an ASR system for Sorani Kurdish. -
IWSLT2018 Speech Translation Task
The dataset used in the paper is the IWSLT2018 speech translation task, which consists of five parts: TED corpus, Speech-translation TED corpus, TED LIUM corpus, WMT18 data and... -
Wall Street Journal
The Wall Street Journal dataset is used for syntactic linearization. It contains a large corpus of news articles with their corresponding syntactic trees. -
Correction Focused Language Model Training for Speech Recognition
Language models have been commonly adopted to boost the performance of automatic speech recognition (ASR) particularly in domain adaptation tasks. Conventional way of LM... -
InterFormer: Interactive Local and Global Features Fusion for Automatic Speec...
The local and global features are both essential for automatic speech recognition (ASR). Many recent methods have verified that simply combining local and global features can... -
INT8 Winograd Acceleration for Conv1D Equipped ASR Models Deployed on Mobile ...
The dataset used in this paper is a Conv1D equipped ASR model deployed on mobile devices. -
Libriheavy: A 50,000 Hours ASR Corpus with Punctuation Casing and Context
Libriheavy is a large-scale ASR corpus consisting of 50,000 hours of read English speech derived from LibriVox. To the best of our knowledge, Libriheavy is the largest... -
ATIS dataset
The ATIS dataset is a benchmark dataset for spoken language understanding, consisting of audio recordings and corresponding manual transcripts about humans asking for flight... -
TEDLIUM Corpus
The TEDLIUM corpus is a large-volume corpus used for speech recognition and text summarization. -
How2 Dataset
The How2 dataset consists of summarizations of How2 videos taken from YouTube. -
TED Speech Summarization Corpus
Speech summarization, which generates a text summary from speech, can be achieved by combining automatic speech recognition (ASR) and text summarization (TS).