
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design

Six-bit (FP6) quantization can effectively reduce the size of large language models while consistently preserving model quality across varied applications.
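For intuition, the sketch below shows round-to-nearest quantization to a 6-bit floating-point format, assuming a 1-sign/3-exponent/2-mantissa (E3M2) layout. The function name, the simplified subnormal handling, and the reserved top exponent code are illustrative assumptions, not the repository's CUDA kernels.

```python
import numpy as np

EXP_BITS, MAN_BITS = 3, 2
BIAS = 2 ** (EXP_BITS - 1) - 1          # exponent bias = 3

def fp6_quantize(x: np.ndarray) -> np.ndarray:
    """Round float32 values to the nearest representable E3M2 value.

    Simplified: no NaN/Inf handling; the lowest exponent bin doubles
    as a subnormal-like grid that includes zero.
    """
    sign = np.sign(x)
    mag = np.abs(x).astype(np.float64)
    # Unbiased exponent of each value, clamped to the representable range
    # (top exponent code reserved, as in IEEE-style formats; a simplification).
    exp = np.floor(np.log2(np.where(mag > 0, mag, 1.0)))
    exp = np.clip(exp, 1 - BIAS, 2 ** EXP_BITS - 2 - BIAS)
    scale = 2.0 ** (exp - MAN_BITS)      # spacing between adjacent codes
    q = np.round(mag / scale) * scale    # round mantissa to 2 bits
    # Largest representable magnitude: (2 - 2^-2) * 2^3 = 14.0
    max_val = (2 - 2.0 ** -MAN_BITS) * 2.0 ** (2 ** EXP_BITS - 2 - BIAS)
    return sign * np.minimum(q, max_val)

w = np.array([0.8731, -0.0042, 3.14, 12.0], dtype=np.float32)
print(fp6_quantize(w))   # e.g. 0.8731 -> 0.875, 3.14 -> 3.0
```

Each float32 weight snaps to one of at most 64 representable values, so storage falls to 6 bits per weight while the value grid stays densest near zero, where most trained weights lie.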

Data and Resources

This dataset has no data

Cite this as

Haojun Xia, Zhen Zheng, Xiaoxia Wu, Shiyang Chen, Zhewei Yao, Stephen Youn, Arash Bakhtiari, Michael Wyatt, Yuxiong He, Olatunji Ruwase, Shuaiwen Leon Song (2025). Dataset: FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design. https://doi.org/10.57702/z4ohr5qr

Private DOI: This DOI is not yet resolvable. It is available for use in manuscripts and will be published when the dataset is made public.

Additional Info

Created: January 3, 2025
Last update: January 3, 2025
Defined in: https://doi.org/10.48550/arXiv.2401.14112
Authors: Haojun Xia, Zhen Zheng, Xiaoxia Wu, Shiyang Chen, Zhewei Yao, Stephen Youn, Arash Bakhtiari, Michael Wyatt, Yuxiong He, Olatunji Ruwase, Shuaiwen Leon Song
Homepage: https://github.com/usyd-fsalab/fp6_llm