-
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algori...
Six-bit quantization can effectively reduce the size of large language models and preserve the model quality consistently across varied applications. -
Fairness Certification for Natural Language Processing and Large Language Models
A large text corpus used to train and evaluate natural language processing models. -
Integer or floating point? New outlooks for low-bit quantization on large lan...
The paper does not explicitly describe its dataset; it mentions only that it is a large language model dataset. -
A comprehensive study on post-training quantization for large language models
The ZeroQuant dataset is a large language model dataset used in the paper. -
OPT: Open pre-trained transformer language models
The OPT dataset is a large language model dataset used in the paper. -
ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Fl...
The paper does not explicitly describe its dataset; it mentions only that it is a large language model dataset. -
WebWISE: Web Interface Control and Sequential Exploration with Large Language...
The paper investigates using a Large Language Model (LLM) to automatically perform web software tasks using click, scroll, and text input operations. -
Modality-Aware Integration with Large Language Models for Knowledge-based Vis...
Knowledge-based visual question answering (KVQA) has been extensively studied to answer visual questions with external knowledge, e.g., knowledge graphs (KGs). -
Knowledge Graph-Enhanced Large Language Models via Path Selection
Two datasets, MetaQA and FACTKG, are used to evaluate the effectiveness of the proposed method KELP. MetaQA is a critical benchmark dataset containing subsets of questions with... -
LiveCodeBench
LiveCodeBench is a benchmark for evaluating the performance of Large Language Models (LLMs) in code editing tasks, including debugging, translating, polishing, and requirement... -
Towards Expert-Level Medical Question Answering with Large Language Models
A large-scale dataset for medical question answering with large language models. -
LLaVA-Instruct-150k
Visual question answering dataset -
MACHIAVELLI Benchmark
A dataset of traces from the MACHIAVELLI environment, including API calls and their outcomes. -
BELLS: A Framework Towards Future Proof Benchmarks for the Evaluation of LLM ...
A structured collection of tests for input-output safeguards, including established failure tests, emerging failure tests, and next-gen architecture tests.