Continuous sign language recognition (SLR) deals with unaligned video-text pairs and uses the word error rate (WER), i.e., a length-normalized edit distance, as the main evaluation metric.
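For concreteness, WER can be sketched as the word-level Levenshtein distance between a predicted and a reference sequence, divided by the reference length. The snippet below is a minimal illustration, not the evaluation script used in this work: the function names `edit_distance` and `wer`, the whitespace tokenization, and the corpus-level aggregation (total edits over total reference words) are all assumptions made for the example.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance (substitutions, deletions, insertions)."""
    # prev[j] holds the distance between the first i-1 reference tokens
    # and the first j hypothesis tokens.
    prev: List[int] = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from i reference tokens to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level WER: total edits divided by total reference length.

    Assumption: tokens are separated by whitespace.
    """
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_edits = 0
    total_ref_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = ref.split()
        hyp_tokens = hyp.split()
        total_edits += edit_distance(ref_tokens, hyp_tokens)
        total_ref_len += len(ref_tokens)
    if total_ref_len == 0:
        raise ValueError("reference corpus is empty")
    return total_edits / total_ref_len
```

For example, `wer(["A B C D"], ["A X C"])` counts one substitution and one deletion against four reference tokens, giving 0.5. Because insertions are counted as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.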
WLASL is the latest ASL dataset and has a larger vocabulary size of 2,000. It consists of 14,289, 3,916, and 2,878 samples in the training, dev, and test sets, respectively.
MSASL is an American Sign Language (ASL) dataset with a vocabulary size of 1,000. It consists of 16,054, 5,287, and 4,172 samples in the training, development (dev), and test...