When and why Vision-Language Models behave like Bags-of-Words, and what to do about it?

doi:10.57702/pxdwqv15

License

No License Provided

Data and Resources

Original Metadata (JSON)
The json representation of the dataset with its distributions based on DCAT.

Cite this as

Mert Yuksekgonul, Federico Bianchi, Pratyusha Kalluri, Dan Jurafsky, James Zou (2024). Dataset: When and why Vision-Language Models behave like Bags-of-Words, and what to do about it?. https://doi.org/10.57702/pxdwqv15

DOI retrieved: December 2, 2024

Additional Info

Field	Value
Created	December 2, 2024
Last update	December 2, 2024
Author	Mert Yuksekgonul
More Authors	Federico Bianchi, Pratyusha Kalluri, Dan Jurafsky, James Zou
Homepage	https://arxiv.org/abs/2302.12043
