Synthetic Workload for LLM Serving

The dataset used in the paper is a synthetic workload in which clients send requests with different input and output lengths and at varying request rates.
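
As a rough illustration of what such a workload can look like, the sketch below generates a trace in which each client has its own request rate and each request has its own input and output length. It is not the generator used in the paper. The Poisson arrival process, the uniform length distributions, and all names (`Request`, `generate_workload`, `client_rates`, and the example values) are assumptions made for this example.

```python
import random
from dataclasses import dataclass


@dataclass
class Request:
    client_id: int
    arrival_time: float  # seconds since the start of the trace
    input_len: int       # prompt length in tokens
    output_len: int      # number of tokens to generate


def generate_workload(client_rates, duration, input_range, output_range, seed=0):
    """Return a list of requests sorted by arrival time.

    client_rates: requests per second for each client (hypothetical values).
    duration: length of the trace in seconds.
    input_range / output_range: inclusive (min, max) token lengths.
    Assumption: Poisson arrivals and uniformly distributed lengths.
    """
    rng = random.Random(seed)
    requests = []
    for client_id, rate in enumerate(client_rates):
        # Exponential inter-arrival gaps produce a Poisson arrival process.
        t = rng.expovariate(rate)
        while t < duration:
            requests.append(Request(
                client_id=client_id,
                arrival_time=t,
                input_len=rng.randint(*input_range),
                output_len=rng.randint(*output_range),
            ))
            t += rng.expovariate(rate)
    requests.sort(key=lambda r: r.arrival_time)
    return requests


if __name__ == "__main__":
    # Two clients: one slow (0.5 req/s) and one fast (2 req/s), over 60 seconds.
    trace = generate_workload(
        client_rates=[0.5, 2.0],
        duration=60.0,
        input_range=(32, 512),
        output_range=(16, 256),
    )
    print(len(trace), trace[0])
```

Fixing the seed keeps the trace reproducible, so the same sequence of requests can be replayed against different serving configurations.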

BibTeX: