-
GoodSAM: Bridging Domain and Capacity Gaps via Segment Anything Model
This paper tackles a novel problem: how to transfer knowledge from the emerging Segment Anything Model (SAM) to learn a compact panoramic semantic segmentation model, i.e.,... -
VCTK Dataset
The VCTK dataset is a large multi-speaker corpus of English speech recordings, in which each recording contains a single speaker reading a single sentence. -
LJSpeech Dataset
The LJSpeech dataset is a collection of audio recordings of a single female speaker reading aloud. -
FastSpeech 2: Fast and High-Quality End-to-End Text-to-Speech
FastSpeech 2 is a fast and high-quality end-to-end text-to-speech system. It uses a multi-task learning approach to learn the mapping between phonemes and waveforms. -
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis
FastDiff is a fast conditional diffusion model for high-quality speech synthesis. It employs a stack of time-aware location-variable convolutions with diverse receptive field... -
ImageNet-Dogs
This dataset is used in the paper for image classification, object detection, and face verification tasks. -
ImageNet-10
ImageNet-10 is a dataset of 10,000 224x224 color images in 10 classes, with 1,000 images per class. -
FULL1 and FULL2 datasets
The FULL1 and FULL2 datasets are subsets of the Oxford RobotCar dataset, with longer route lengths. -
LOOP1 and LOOP2 datasets
The LOOP1 and LOOP2 datasets are subsets of the Oxford RobotCar dataset, with shorter route lengths. -
Amazon Photos
This dataset is used in the paper to evaluate the influence of graph elements on the parameter changes of GCNs without retraining them. -
Amazon Computers
This dataset is used in the paper to evaluate the influence of graph elements on the parameter changes of GCNs without retraining them. -
Amazon
The paper uses a series of datasets introduced in [46], comprising large corpora of product reviews crawled from Amazon.com. Top-level product categories on... -
MovieLens10M
The MovieLens10M dataset is a classic movie rating dataset with ratings ranging from 0.5 to 5. -
PhotoBot: Reference-Guided Interactive Photography via Natural Language
PhotoBot is a framework for fully automated photo acquisition based on an interplay between high-level human language guidance and a robot photographer. -
Symmetric Autoencoder for Redatuming Physical Systems
The dataset is used for redatuming physical systems with symmetric autoencoders. It contains seismic wave data from different states with varying nuisance parameters. -
Deterministic Policy Gradients With General State Transitions
The authors ran experiments on the ComplexPoint-v0, Pendulum-v0, LunarLanderContinuous-v2, Swimmer-v2, HalfCheetah-v2, HumanoidStandup-v2, and Humanoid-v2 environments.