Generating and sharing differentially private spatio-temporal data using real-world knowledge

WRAP_THESIS_Cunningham_2022.pdf - Submitted Version (44MB)

Abstract

Privacy-preserving spatio-temporal data sharing is vital for addressing many real-world problems, such as managing disease spread or tailoring public services to a population’s travel patterns. Differential privacy has become the de facto privacy standard owing to its strong privacy guarantees, although existing mechanisms make very restrictive assumptions regarding what outside knowledge is known beyond the data itself. This limits the practical utility of the private data, and has prevented the widespread deployment of differentially private algorithms in the real world.

This thesis aims to show that incorporating publicly available information, such as the road network or characteristics of places of interest, can enhance the practical utility of the output data without negatively affecting privacy. This thesis focuses on two main problems, both of which are fundamental in enabling location analytics with private data. The first considers the synthesis of spatial point data, and three solutions are proposed. The first solution uses a private adaptation of kernel density estimation to generate data within small private partitions, and the second uses the road network as the basis for data generation. The third solution combines randomised response with generative adversarial networks to develop a generative model that satisfies label local differential privacy – a more practical and realistic privacy setting. The second problem focuses on sharing trajectory data using local differential privacy. The proposed solution uses the exponential mechanism to efficiently perturb overlapping, hierarchically structured n-grams of trajectory data, which help to preserve the spatio-temporal correlations inherent in trajectory data. This problem, and its solution, is then extended to a setting in which two services wish to privately share event sequence data with each other.

All solutions incorporate publicly available external knowledge by imposing hard constraints on feasible outputs, exploiting the intrinsic hierarchies and underlying structures of real-world data, and using distance functions to ensure that semantically similar values are more likely to be output. Experiments with real data show that including this information helps to produce private data that performs very well in many spatio-temporal analytical tasks, including range, hotspot, and facility location queries. These strong results demonstrate the potential for more widespread use of differential privacy in the real world.

Item Type: Thesis [via Doctoral College] (PhD)
Subjects: Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Library of Congress Subject Headings (LCSH): Privacy, Right of, Electronic data processing, Location-based services, Big data, Data protection, Data mining, Spatial data mining, Geospatial data -- Computer processing
Official Date: July 2022
Dates:
Date: July 2022, Event: UNSPECIFIED
Institution: University of Warwick
Theses Department: Department of Computer Science
Thesis Type: PhD
Publication Status: Unpublished
Supervisor(s)/Advisor: Ferhatosmanoglu, Hakan
Sponsors: Engineering and Physical Sciences Research Council ; European Research Council ; Alan Turing Institute
Format of File: pdf
Extent: xiii, 149 pages : colour illustrations
Language: eng
URI: https://wrap.warwick.ac.uk/173604/
