University of Warwick
Publications service & WRAP

Data-independent space partitionings for summaries


Cormode, Graham, Garofalakis, Minos and Shekelyan, Michael (2021) Data-independent space partitionings for summaries. In: The 2021 ACM SIGMOD/PODS Conference, Virtual conference, 20-25 Jun 2021. Published in: PODS'21: Proceedings of the 40th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems pp. 285-298. ISBN 9781450383813. doi:10.1145/3452021.3458316

WRAP-Data-independent-space-partitionings-summaries-2021.pdf - Accepted Version - Requires a PDF viewer.

Official URL: https://doi.org/10.1145/3452021.3458316


Abstract

Histograms are a standard tool in data management for describing multidimensional data. It is often convenient or even necessary to define data-independent histograms, which partition space in advance without observing the data itself. Specific motivations arise when it is not suitable to change the boundaries between histogram cells frequently: for example, when the data is subject to many insertions and deletions, when data is distributed across multiple systems, or when producing a privacy-preserving representation of the data. The baseline approach is an equiwidth histogram, i.e., a regular grid over the space. However, this is not optimal for the objective of splitting the multidimensional space into (possibly overlapping) bins such that each box can be rebuilt using a set of non-overlapping bins with minimal excess (or deficit) of volume. We therefore investigate how to split the space into bins and identify novel solutions that offer a good balance of desirable properties. As many data processing tools require a dataset as input, we propose efficient methods for obtaining synthetic point sets that match the histograms over the overlapping bins.
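
To make the baseline concrete, the following is a minimal sketch of an equiwidth (regular grid) histogram only, not of the partitionings proposed in the paper. It assumes points lie in the unit cube [0, 1)^d; the class name `EquiwidthHistogram` and its methods are hypothetical names chosen for illustration. Because the cell boundaries depend only on the grid resolution, insertions and deletions never move them. A query box is approximated from the inside (cells fully contained, a deficit of volume) and from the outside (cells that overlap it, an excess of volume).

```python
from itertools import product
from math import ceil, floor


class EquiwidthHistogram:
    """Illustrative regular grid over [0, 1)^d (hypothetical helper class)."""

    def __init__(self, dims, cells_per_dim):
        self.dims = dims
        self.k = cells_per_dim
        self.counts = {}  # sparse map: cell index tuple -> point count

    def _cell(self, point):
        # Boundaries are fixed by k alone; the data never influences them.
        return tuple(min(int(x * self.k), self.k - 1) for x in point)

    def insert(self, point):
        c = self._cell(point)
        self.counts[c] = self.counts.get(c, 0) + 1

    def delete(self, point):
        # Assumes the point was inserted earlier.
        c = self._cell(point)
        self.counts[c] = self.counts.get(c, 0) - 1

    def count_bounds(self, lo, hi):
        """Lower and upper bounds on the number of points in the box [lo, hi)."""
        # Cells fully inside the box: their union has a deficit of volume.
        inner = [range(ceil(l * self.k), floor(h * self.k))
                 for l, h in zip(lo, hi)]
        # Cells overlapping the box: their union has an excess of volume.
        outer = [range(floor(l * self.k), min(ceil(h * self.k), self.k))
                 for l, h in zip(lo, hi)]
        lower = sum(self.counts.get(c, 0) for c in product(*inner))
        upper = sum(self.counts.get(c, 0) for c in product(*outer))
        return lower, upper


h = EquiwidthHistogram(dims=2, cells_per_dim=4)
for p in [(0.1, 0.1), (0.3, 0.3), (0.6, 0.6), (0.9, 0.9)]:
    h.insert(p)

# A box aligned with the grid is rebuilt exactly, so both bounds agree.
print(h.count_bounds((0.25, 0.25), (0.75, 0.75)))  # (2, 2)
# A misaligned box is only bracketed by the inner and outer cell sets.
print(h.count_bounds((0.2, 0.2), (0.7, 0.7)))      # (1, 3)
```

The gap between the two bounds for the misaligned box reflects the excess and deficit of volume that the abstract names as the quantity to minimise.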

Item Type: Conference Item (Paper)
Subjects: Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Divisions: Faculty of Science > Computer Science
Library of Congress Subject Headings (LCSH): Database management, Data mining, Querying (Computer science), Data structures (Computer science)
Journal or Publication Title: PODS'21: Proceedings of the 40th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
Publisher: ACM
ISBN: 9781450383813
Official Date: 20 June 2021
Dates:
  • 20 June 2021: Published
  • 2021: Available
  • 12 April 2021: Accepted
Page Range: pp. 285-298
DOI: 10.1145/3452021.3458316
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Open Access
RIOXX Funder/Project Grant:
  • Project/Grant ID: ERC-2014-CoG 647557
  • RIOXX Funder Name: European Research Council
  • Funder ID: http://dx.doi.org/10.13039/501100000781
Conference Paper Type: Paper
Title of Event: The 2021 ACM SIGMOD/PODS Conference
Type of Event: Conference
Location of Event: Virtual conference
Date(s) of Event: 20-25 Jun 2021
