-
Unstructured Social Activity Attribute (USAA)
A video dataset of 69 instance-level attributes for 8 classes of complex social group activity videos. -
Animals with Attributes (AwA)
Zero-shot learning (ZSL) aims to classify objects that are not observed or seen during training. It relies on class semantic description to transfer knowledge from the seen... -
Towards Zero-Shot Frame Semantic Parsing for Domain Scaling
A dataset for zero-shot frame semantic parsing for domain scaling. -
Animals with Attributes 2
A zero-shot learning dataset for image recognition. -
AWA1 and AWA2
The AWA1 and AWA2 datasets are used for zero-shot learning tasks. -
OTTER: Improving Zero-Shot Classification via Optimal Transport
Zero-shot models suffer due to artifacts inherited from pretraining. A particularly detrimental artifact, caused by unbalanced web-scale pretraining data, is mismatched label... -
SUN Attribute
SUN Attribute consists of 717 classes of images with annotations. -
Random Word Data Augmentation for Zero-Shot Anomaly Detection
This paper presents a novel method that leverages a visual-language model, CLIP, as a data source for zero-shot anomaly detection. -
Zero-Shot Automatic Pronunciation Assessment
Automatic Pronunciation Assessment (APA) is vital for computer-assisted language learning. Prior methods rely on annotated speech-text data to train Automatic Speech Recognition... -
Zero-1-to-3: Zero-shot one image to 3D object
Zero-shot one image to 3D object. -
Zero-Shot Temporal Action Detection via Vision-Language Prompting
STALE (Zero-Shot Temporal Action Detection via Vision-Language Prompting) is a model for the under-studied yet practically useful task of zero-shot temporal action detection (ZS-TAD). -
25 public datasets
A collection of 25 public datasets used to evaluate the MS-CLIP model on zero-shot learning and linear probing. -
Class Representative Learning Model
The CRL model is based on class-level classifiers, built class by class, that would be representative of instances of a specific class by utilizing activation features of... -
Google Open Images
The Google Open Images dataset contains 19,958 categories and is used for zero-shot learning. -
Vision-by-Language for Training-Free Compositional Image Retrieval
Compositional Image Retrieval through Vision-by-Language (CIReVL) is a training-free approach for Zero-Shot Compositional Image Retrieval (CIR). Utilizing off-the-shelf... -
ODIN: On-demand Data Formulation to Mitigate Dataset Lock-in
ODIN is an innovative approach that addresses the problem of dataset constraints by integrating generative AI models.