HowToStep is a large-scale instructional dataset constructed for training by transforming the original transcripts of HTM-370K into around 4M ordered instructional...
How2Sign is a large-scale continuous American Sign Language (ASL) dataset. After removing invalid text-video pairs, we retain 31019, 1738, and 2348 available pairs in the...
The INRIA YouTube Instructional Videos dataset contains five tasks from different instructional domains: “making coffee”, “changing a car tire”, “CPR”, “jumping a car”, and...
COIN is a large-scale dataset containing 100 hours of video, used for instructional video analysis and understanding.