DropIT: DROPPING INTERMEDIATE TENSORS FOR MEMORY-EFFICIENT DNN TRAINING

The paper does not explicitly describe a dataset of its own. It mentions that the authors used DeiT and ViT models on the ImageNet-1k and CIFAR-100 datasets.

BibTeX: