Topics in compositional data analysis
Ali, Zeeshan (2022) Topics in compositional data analysis. PhD thesis, University of Warwick.
PDF: WRAP_Theses_Ali_2022.pdf (2264Kb) - Submitted Version. Embargoed item: access restricted to repository staff only until 25 May 2025. Contact the author directly, specifying your needs.
Official URL: http://webcat.warwick.ac.uk/record=b3940918
Abstract
Compositional data are vectors of nonnegative elements that sum to one and convey relative information. The dominant statistical methodology for analysing compositional data is the logratio approach developed by John Aitchison in the 1980s. Although this approach has been remarkably successful, mainly because of its mathematical underpinning, it is problematic in certain applications, such as psephology (i.e., the statistical study of elections and voting patterns), where the issues include difficult interpretation and the widely encountered problem of zero-valued data. The main contribution of this thesis is to highlight these problems and to propose novel methods for handling them in the context of psephology. The methodological focus of the thesis is comparing and clustering compositions and their trajectories using both the logratio methods and the proposed methods; the applied focus is the voting data for England in the general elections of 2010, 2015 and 2017.
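The zero-value problem mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions of the additive logratio (alr) and centred logratio (clr) transformations; the function names and the vote shares are hypothetical and are not taken from the thesis.

```python
import math

def closure(parts):
    """Rescale nonnegative parts so that they sum to one."""
    total = sum(parts)
    if total <= 0:
        raise ValueError("parts must have a positive sum")
    return [p / total for p in parts]

def clr(composition):
    """Centred logratio: log of each part minus the mean log of all parts."""
    logs = [math.log(p) for p in composition]  # math.log(0) raises ValueError
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def alr(composition):
    """Additive logratio, using the last part as the reference part."""
    ref = composition[-1]
    return [math.log(p / ref) for p in composition[:-1]]

# Hypothetical vote shares for three parties (illustrative numbers only).
shares = closure([45, 35, 20])
clr_shares = clr(shares)   # three coordinates that sum to zero
alr_shares = alr(shares)   # two coordinates relative to the third party

# A party that wins no votes gives a zero part, and the logratio is undefined.
with_zero = closure([60, 40, 0])
try:
    clr(with_zero)
    zero_ok = True
except ValueError:
    zero_ok = False        # the transformation fails on the zero part
```

In practice zeros are often replaced by small positive values before a logratio transformation is applied; the sketch only shows why some such step, or a different transformation, is needed.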
Specifically, this thesis proposes a new dissimilarity measure for compositional data (Chapter 2), a set of criteria that a sensible scalar measure of difference (i.e., a distance or dissimilarity measure) for compositional data should satisfy (Chapter 3), and a new transformation that can be used for compositions with some parts at or near zero (Chapter 4). Chapter 2 argues that distance or dissimilarity measures that do not satisfy the properties of subcompositional dominance and perturbation invariance (e.g., the proposed dissimilarity measure) are more sensible for compositional voting data than those that do (e.g., the well-known Aitchison distance). Moreover, it is shown that most of the criteria proposed in Chapter 3 do not hold for the Aitchison distance. Chapter 4 argues that the geometry of the proposed transformation is more relevant to the geometry of ternary diagrams (or of the simplex, in general) than the geometry of the widely used additive logratio and centred logratio transformations.
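For readers unfamiliar with the two properties named above, the following sketch checks them numerically for the Aitchison distance, taken here in its standard form as the Euclidean distance between clr-transformed compositions. It does not implement the dissimilarity measure proposed in the thesis, which the abstract does not define; the helper names and example compositions are hypothetical.

```python
import math

def closure(parts):
    """Rescale positive parts so that they sum to one."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(composition):
    """Centred logratio transformation (all parts must be positive)."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def aitchison_distance(x, y):
    """Euclidean distance between the clr coordinates of x and y."""
    return math.dist(clr(x), clr(y))

def perturb(x, p):
    """Perturbation: component-wise product followed by closure."""
    return closure([a * b for a, b in zip(x, p)])

# Hypothetical three-part compositions and a perturbing composition.
x = closure([0.50, 0.30, 0.20])
y = closure([0.40, 0.40, 0.20])
p = closure([0.10, 0.30, 0.60])

# Perturbation invariance: perturbing both compositions by p leaves the
# Aitchison distance unchanged, whereas the plain Euclidean distance changes.
d_before = aitchison_distance(x, y)
d_after = aitchison_distance(perturb(x, p), perturb(y, p))
e_before = math.dist(x, y)
e_after = math.dist(perturb(x, p), perturb(y, p))

# Subcompositional dominance: the distance between the subcompositions made
# of the first two parts does not exceed the distance between the full ones.
d_sub = aitchison_distance(closure(x[:2]), closure(y[:2]))
```

Here `d_before` and `d_after` agree up to floating-point error, and `d_sub` is no larger than `d_before`. The thesis argues that, for voting data, measures without these two properties can be the more sensible choice.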
| Item Type: | Thesis (PhD) |
|---|---|
| Subjects: | Q Science > QA Mathematics |
| Library of Congress Subject Headings (LCSH): | Vector analysis, Correlation (Statistics), Multivariate analysis, Mathematical statistics, Data analysis |
| Official Date: | September 2022 |
| Institution: | University of Warwick |
| Theses Department: | Department of Statistics |
| Thesis Type: | PhD |
| Publication Status: | Unpublished |
| Supervisor(s)/Advisor: | Firth, David (Professor of statistics); Berrett, Thomas Benjamin |
| Sponsors: | University of Warwick. Chancellor's International Scholarship |
| Extent: | xvi, 17-156 pages : illustrations |
| Language: | eng |