Reducing spatio-temporal data : methods and analysis
Steadman, Liam (2020) Reducing spatio-temporal data : methods and analysis. PhD thesis, University of Warwick.
PDF: WRAP_Theses_Steadman_2020.pdf - Submitted Version (18Mb)
Official URL: http://webcat.warwick.ac.uk/record=b3714354~S15
Abstract
Analysing and learning from spatio-temporal datasets is an important process in many domains, including transportation, healthcare and meteorology. However, in recent years, the volume of data generated for such datasets has increased significantly. This poses several challenges for data scientists, including increased processing overheads and costs. Thus, several methods have been proposed for reducing the volume of data stored and processed to analyse and learn from these datasets. However, existing methods fail to take advantage of the spatial and temporal autocorrelation present in spatio-temporal data, incur unnecessary overheads when retrieving the data, or fail to retain information about all instances and features.
This thesis introduces several data reduction methods to address these limitations. First, the kD-STR algorithm is introduced, which hierarchically partitions and models the data, thereby reducing the storage overhead of the dataset. This method minimises the storage used and error incurred. Second, this reduction method is adapted for the context of data linking, and an alternative heuristic is proposed that minimises error in the features engineered during linking. Third, adapted algorithms are presented for reducing multiple datasets simultaneously, and for reducing large datasets in a distributed manner.
Through empirical analysis using real-world datasets, the utility of these algorithms is investigated. The results presented demonstrate the data reduction that can be achieved using these algorithms, as well as the impact of using different spatial referencing systems and modelling techniques. Further analysis is presented that demonstrates the effect of error in location and time, noise and missing data on the data reduction. Combined, the algorithms presented offer an improvement over the state-of-the-art in spatio-temporal data reduction, and the analysis presented demonstrates the results that may be achieved for datasets exhibiting a range of characteristics.
| Item Type: | Thesis (PhD) |
|---|---|
| Subjects: | H Social Sciences > HE Transportation and Communications ; Q Science > QA Mathematics ; Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software |
| Library of Congress Subject Headings (LCSH): | Data reduction, Big data -- Data processing, Big data -- Management, Big data -- Mathematics, Transportation -- Technological innovations -- Great Britain, Electronic data processing |
| Official Date: | December 2020 |
| Institution: | University of Warwick |
| Theses Department: | Department of Computer Science |
| Thesis Type: | PhD |
| Publication Status: | Unpublished |
| Supervisor(s)/Advisor: | Griffiths, Nathan ; Jarvis, Stephen A., 1970- |
| Sponsors: | Engineering and Physical Sciences Research Council ; Transport Research Laboratory (Great Britain) |
| Extent: | xvi, 230 leaves : colour illustrations, colour maps |
| Language: | eng |