1 dataset found

Tags: Multiscale Vision Transformers

  • Multiscale Vision Transformers

    Multiscale Vision Transformers (MViT) are models for video and image recognition that connect the seminal idea of multiscale feature hierarchies with transformer models.
You can also access this registry using the API (see API Docs).