-
Unstructured Social Activity Attribute (USAA)
A video dataset of 69 instance-level attributes for 8 classes of complex social group activity videos. -
Animals with Attributes (AwA)
Zero-shot learning (ZSL) aims to classify objects that are not observed or seen during training. It relies on class semantic description to transfer knowledge from the seen... -
Towards Zero-Shot Frame Semantic Parsing for Domain Scaling
A dataset for zero-shot frame semantic parsing for domain scaling. -
Animals with Attributes 2
Zero-shot learning dataset for image recognition -
Z-icl: Zero-shot in-context learning with pseudo-demonstrations
This survey concentrates on few-shot In-Context Learning (ICL) using retrieved examples for large language models, a key aspect of Retrieval-Augmented Generation (RAG). -
Synthesis Step by Step (S3)
Data Synthesis is a promising way to train a small model with very little labeled data. One approach for data synthesis is to leverage the rich knowledge from large language... -
Donut: Hierarchical EMD-Space Planning for Zero-Shot Deformable Manipulation ...
The dataset used in the paper is a simulated dough manipulation environment, where the goal is to create a donut, a baguette, and two pancakes using a set of candidate tools. -
On The Ingredients of an Effective Zero-shot Semantic Parser
Semantic parsers map natural language utterances into meaning representations (e.g. programs). Such models are typically bottle-necked by the paucity of training data due to the... -
UT-Zappos50K
The UT-Zappos50K dataset is a fine-grained shoe catalog, characterized by its smaller scale and relatively stable and simple content. -
Finetuned language models are zero-shot learners
Finetuned language models are zero-shot learners -
Zero-1-to-3: Zero-shot one image to 3D object
Zero-shot one image to 3D object. -
InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction
Text-conditioned human motion generation has experienced significant advancements with diffusion models trained on extensive motion capture data and corresponding textual... -
25 public datasets
A collection of 25 public datasets used to evaluate the MS-CLIP model on zero-shot learning and linear probing. -
MVP-SEG: Multi-View Prompt Learning for Open-Vocabulary Semantic Segmentation
Semantic segmentation performs pixel-level classification to localize objects from different classes in the input image. Open-vocabulary semantic segmentation aims to... -
Google Open Images
Google Open Images dataset, which contains 19,958 categories and is used for zero-shot learning. -
Zero-shot video question answering via frozen bidirectional language models
Zero-shot video question answering via frozen bidirectional language models. -
HMDB51 and UCF101
The datasets used in the paper are HMDB51 and UCF101. -
Kinetics-400 and Something-Something-V2
The datasets used in the paper are Kinetics-400 and Something-Something-V2. -
Language-free Training for Zero-shot Video Grounding
Given an untrimmed video and a language query, video grounding aims to localize the time interval by understanding the text and video simultaneously.