
The Library
Continuous sampling from distributed streams
Tools
Cormode, Graham, Muthukrishnan, S., Yi, Ke and Zhang, Qin (2012) Continuous sampling from distributed streams. Journal of the ACM, 59 (2). Article number 10. doi:10.1145/2160158.2160163 ISSN 0004-5411.
Research output not available from this repository.
Request-a-Copy directly from author or use local Library Get it For Me service.
Official URL: http://dx.doi.org/10.1145/2160158.2160163
Abstract
A fundamental problem in data management is to draw and maintain a sample of a large data set, for approximate query answering, selectivity estimation, and query planning. With large, streaming data sets, this problem becomes particularly difficult when the data is shared across multiple distributed sites. The main challenge is to ensure that a sample is drawn uniformly across the union of the data while minimizing the communication needed to run the protocol on the evolving data. At the same time, it is also necessary to make the protocol lightweight, by keeping the space and time costs low for each participant. In this article, we present communication-efficient protocols for continuously maintaining a sample (both with and without replacement) from k distributed streams. These apply to the case when we want a sample from the full streams, and to the sliding window cases of only the W most recent elements, or arrivals within the last w time units. We show that our protocols are optimal (up to logarithmic factors), not just in terms of the communication used, but also the time and space costs for each participant.
Item Type: | Journal Article | ||||
---|---|---|---|---|---|
Subjects: | Q Science > QA Mathematics | ||||
Divisions: | Faculty of Science, Engineering and Medicine > Science > Computer Science | ||||
Library of Congress Subject Headings (LCSH): | Sampling (Statistics) | ||||
Journal or Publication Title: | Journal of the ACM | ||||
Publisher: | ACM Special Interest Group | ||||
ISSN: | 0004-5411 | ||||
Official Date: | April 2012 | ||||
Dates: |
|
||||
Volume: | 59 | ||||
Number: | 2 | ||||
Number of Pages: | 25 | ||||
Article Number: | Article number 10 | ||||
DOI: | 10.1145/2160158.2160163 | ||||
Status: | Peer Reviewed | ||||
Publication Status: | Published | ||||
Access rights to Published version: | Restricted or Subscription Access | ||||
Funder: | Hong Kong University of Science and Technology‏ (HKUST), Google (Firm), National Science Foundation (U.S.) (NSF) | ||||
Grant number: | 0414852 (NSF) |
Request changes or add full text files to a record
Repository staff actions (login required)
![]() |
View Item |