- Voice Aging with Audio-Visual Style Transfer
Face aging techniques have used generative adversarial networks (GANs) and style transfer learning to transform one’s appearance to look younger or older. Identity is maintained by...
- Visually Indicated Sounds
A dataset of audio-visual pairs in which the audio is visually indicated.
- VGGSound: A large-scale audio-visual dataset
A large-scale dataset of audio-visual pairs.
- LISA: Localized Image Stylization with Audio
LISA is a novel framework for audio-guided local image stylization. An audio-visual sound source localizer provides a delicate localization map by leveraging the CLIP embedding...