Devil in the Number: Towards Robust Multi-modality Data Filter

doi:10.57702/wrxz766y

License

No License Provided

A web-scale dataset of text-image pairs, used in the paper to train a vision-language model. The authors propose a novel filter that removes redundant information such as numbers and bracketed content.
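
As a rough illustration of the kind of filtering described above, the following is a minimal sketch that strips numbers and bracketed content from a caption string. It is not the authors' implementation: the function name `strip_redundant`, the regular expressions, and the assumption that the filter operates on caption text are all hypothetical, chosen only to make the idea concrete.

```python
import re

# Hypothetical patterns: this page does not specify the paper's actual rules.
# Bracketed content: (...), [...], or {...} without nesting.
BRACKETED = re.compile(r"\([^()]*\)|\[[^\[\]]*\]|\{[^{}]*\}")
# Numbers: digit runs, optionally grouped with '.' or ',' (e.g. 1,024 or 3.5).
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")
WHITESPACE = re.compile(r"\s+")


def strip_redundant(caption: str) -> str:
    """Remove bracketed spans and numbers from a caption, then tidy spacing."""
    text = BRACKETED.sub(" ", caption)  # drop bracketed content first
    text = NUMBER.sub(" ", text)        # then drop any remaining numbers
    return WHITESPACE.sub(" ", text).strip()


if __name__ == "__main__":
    print(strip_redundant("A red car (stock photo) 2021"))
```

Bracketed spans are removed before numbers so that digits inside brackets disappear together with their enclosing text instead of leaving empty brackets behind.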