-
Investigating the Vision Transformer Model for Image Retrieval Tasks
The paper introduces a plug-and-play descriptor that can be effectively adopted for image retrieval tasks without prior initialization or preparation. -
Mask-guided Vision Transformer for Few-Shot Learning
The proposed MG-ViT model is used for few-shot learning on the Agri-ImageNet and ACFR apple detection tasks. -
COVID-VIT: Classification of Covid-19 from CT chest images based on vision tr...
Classification of COVID-19 from chest CT images using vision transformer models. -
Osteoarthritis Initiative (OAI) dataset
Knee OsteoArthritis (KOA) dataset used for early detection of KOA (KL-0 vs KL-2) using Vision Transformer (ViT) model with selective shuffled position embedding and key-patch... -
ViT-FOD: A Vision Transformer based Fine-grained Object Discriminator
Fine-grained object discrimination using a Vision Transformer. -
A Novel Vision Transformer with Residual in Self-attention for Biomedical Ima...
Biomedical image classification requires capturing bio-informatics based on a specific feature distribution. In most such applications, there are mainly challenges due to... -
MVTecAD dataset
The MVTecAD dataset is an image dataset covering 15 product categories. -
MNIST, CIFAR10, and MVTecAD datasets
The MNIST, CIFAR10, and MVTecAD datasets were used to verify the anomaly detection and localization performance of the proposed method. -
Shape-Sensitive Loss for Catheter and Guidewire Segmentation
A shape-sensitive loss function for catheter and guidewire segmentation using a vision transformer network. -
WeakTr: Exploring Plain Vision Transformer for Weakly-supervised Semantic Seg...
Weakly-supervised semantic segmentation (WSSS) using a plain Vision Transformer (ViT). -
ImageNet21K
The ImageNet21K dataset is used for training and evaluation of the proposed Circulant Channel-Specific (CCS) token-mixing MLP. -
Diverse instance discovery: Vision-Transformer for instance-aware multi-label...
Multi-label image recognition is a practical and challenging computer vision task. The authors propose a method to leverage the advantages of Transformer with long-range... -
ChestX-ray14
Chest X-rays are widely used to diagnose thoracic diseases, but the lack of detailed information about these abnormalities makes it challenging to develop accurate automated... -
Robustifying Vision Transformer without Retraining from Scratch
Vision Transformer (ViT) is becoming more popular in image processing. We investigate the effectiveness of test-time adaptation (TTA) on ViT, a technique that has emerged to... -
Structural Vision Transformer
Structural Vision Transformer (StructViT) is a vision transformer network that leverages structural self-attention (StructSA) to capture correlation structures in images and...