Sharp Frequency Bounds for Sample-Based Queries
The authors apply a data sketch algorithm to a large data set to statistically infer probably approximately correct (PAC) bounds on the frequencies of items that meet various criteria.
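To make the idea of a PAC frequency bound concrete, here is a minimal sketch that is not taken from the paper. It assumes a uniform random sample and uses the standard two-sided Hoeffding inequality, which is likely looser than the sharp bounds the title refers to. The function name `hoeffding_frequency_bound`, the parameter `delta`, and the sample data are hypothetical and chosen only for illustration.

```python
import math


def hoeffding_frequency_bound(sample, predicate, delta=0.05):
    """Estimate the frequency of items satisfying `predicate`.

    Assumes `sample` was drawn uniformly at random from the full data set.
    Returns (estimate, lower, upper) such that, with probability at least
    1 - delta over the draw of the sample, the true frequency lies in
    [lower, upper]. This is the generic Hoeffding bound, not the paper's
    method.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be strictly between 0 and 1")

    # Fraction of sampled items that meet the criterion.
    hits = sum(1 for item in sample if predicate(item))
    p_hat = hits / n

    # Two-sided Hoeffding: P(|p_hat - p| >= eps) <= 2 * exp(-2 * n * eps^2).
    # Setting the right-hand side equal to delta and solving for eps:
    eps = math.sqrt(math.log(2.0 / delta) / (2.0 * n))

    # Frequencies are confined to [0, 1], so clip the interval.
    return p_hat, max(0.0, p_hat - eps), min(1.0, p_hat + eps)


# Synthetic sample (illustrative only): 30 matching items out of 100.
sample = [1] * 30 + [0] * 70
p_hat, lower, upper = hoeffding_frequency_bound(sample, lambda x: x == 1)
print(f"estimate={p_hat:.3f}, interval=[{lower:.3f}, {upper:.3f}]")
```

The half-width of the interval shrinks in proportion to 1/sqrt(n) and grows only logarithmically as `delta` decreases, which is why a modest sample can support a useful bound on a very large data set.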