18 datasets found

Formats: JSON Tags: text

  • NeurIPS, AAN, NSF Abstracts

    NeurIPS, AAN, NSF Abstracts
  • The Pile

    The Pile dataset contains 3.5 million samples of diverse text for language modeling.
  • Event Location Dataset

    A dataset of around 8,000 labeled sentences in English, each of which is annotated with an event verb and its corresponding location or locations.
  • FLICKR-25K

    A dataset used for the cross-modal hashing task, containing image and text data.
  • Wiki4

    A dataset used for the cross-modal hashing task, containing image and text data.
  • MoMu

    The MoMu dataset is a dataset of molecular graph-text pairs, constructed from scientific articles.
  • MoleculeNet

    The MoleculeNet dataset is a collection of molecular property prediction tasks. It contains 17 datasets, each with a different type of molecular graph.
  • Trafficking-10k

    The Trafficking-10k dataset contains more than 10,000 advertisements annotated for the task of detecting human trafficking. The dataset contains two sources of information per...
  • AudioCaps

    Audio-text retrieval aims at retrieving a target audio clip or caption from a pool of candidates given a query in another modality.
  • LSMDC

    The LSMDC movie description dataset consists of 118,081 short video clips extracted from 202 movies, each annotated with a single caption.
  • MSVD

    Text-Video Retrieval (TVR) aims to align relevant video content with natural language queries. To date, most state-of-the-art TVR methods learn image-to-video transfer learning...
  • ActivityNet Captions

    ActivityNet Captions is a benchmark dataset proposed for dense video captioning. There are 20K untrimmed videos in total, and each video has several annotated segments with...
  • MSR-VTT

    MSR-VTT is a large video description dataset for bridging video and language. The dataset contains 10k video clips with length varying from 10 to...
  • MMVet Dataset

    The MMVet dataset, used for testing the Vary-base model.
  • DocVQA and ChartQA Datasets

    The DocVQA and ChartQA datasets, used for testing the Vary-base model.
  • Document-Level OCR Dataset

    A document-level OCR test set, used for testing the Vary-base model.
  • Natural Image-Text Dataset

    A dataset of natural image-text pairs, used for training the Vary-base model.
  • Document and Chart Dataset

    A dataset of high-resolution document and chart images with corresponding text, used for training the new vision vocabulary network.
You can also access this registry using the API (see API Docs).
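
As a rough illustration of API access, the sketch below builds a search URL that mirrors the filters shown above (format JSON, tag text). It assumes the registry exposes a CKAN-style `package_search` action, which this listing does not confirm; the base URL `https://example.org` and the helper name `build_search_url` are hypothetical placeholders, so check the API Docs for the real endpoint and parameters.

```python
from urllib.parse import urlencode

# Hypothetical base URL: the listing does not say where the registry is hosted.
BASE_URL = "https://example.org"


def build_search_url(base_url, res_format, tag, rows=18):
    """Build a CKAN-style package_search URL.

    Assumption: the registry runs CKAN and accepts a Solr-style `fq` filter.
    """
    params = {
        # Filter by resource format and tag, matching the page's filters.
        "fq": f'res_format:"{res_format}" AND tags:"{tag}"',
        # Number of results to return (18 datasets were listed on the page).
        "rows": rows,
    }
    return f"{base_url}/api/3/action/package_search?{urlencode(params)}"


url = build_search_url(BASE_URL, "JSON", "text")
print(url)
```

The URL is only constructed here, not requested, so the sketch runs without network access; fetching it would be a separate step once the actual host is known.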