Relational clustering models for knowledge discovery and recommender systems
Li, Tao, 1977- (2010) Relational clustering models for knowledge discovery and recommender systems. PhD thesis, University of Warwick.
WRAP_THESIS_Li_2010.pdf - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
Official URL: http://webcat.warwick.ac.uk/record=b2339661~S15
Cluster analysis is a fundamental research field in Knowledge Discovery and Data Mining (KDD). It aims at partitioning a given dataset into some homogeneous clusters so as to reflect the natural hidden data structure. Various heuristic or statistical approaches have been developed for analyzing propositional datasets. Nevertheless, in relational clustering the existence of multi-type relationships will greatly degrade the performance of traditional clustering algorithms. This issue motivates us to find more effective algorithms to conduct the cluster analysis upon relational datasets. In this thesis we comprehensively study the idea of Representative Objects for approximating data distribution and then design a multi-phase clustering framework for analyzing relational datasets with high effectiveness and efficiency. The second task considered in this thesis is to provide some better data models for people as well as machines to browse and navigate a dataset. The hierarchical taxonomy is widely used for this purpose. Compared with manually created taxonomies, automatically derived ones are more appealing because of their low creation/maintenance cost and high scalability. Up to now, the taxonomy generation techniques are mainly used to organize document corpus. We investigate the possibility of utilizing them upon relational datasets and then propose some algorithmic improvements. Another non-trivial problem is how to assign suitable labels for the taxonomic nodes so as to credibly summarize the content of each node. Unfortunately, this field has not been investigated sufficiently to the best of our knowledge, and so we attempt to fill the gap by proposing some novel approaches. The final goal of our cluster analysis and taxonomy generation techniques is to improve the scalability of recommender systems that are developed to tackle the problem of information overload. Recent research in recommender systems integrates the exploitation of domain knowledge to improve the recommendation quality, which however reduces the scalability of the whole system at the same time. We address this issue by applying the automatically derived taxonomy to preserve the pair-wise similarities between items, and then modeling the user visits by another hierarchical structure. Experimental results show that the computational complexity of the recommendation procedure can be greatly reduced and thus the system scalability be improved.
|Item Type:||Thesis or Dissertation (PhD)|
|Subjects:||Q Science > QA Mathematics|
|Library of Congress Subject Headings (LCSH):||Cluster analysis, Information storage and retrieval systems, Recommender systems (Information filtering)|
|Institution:||University of Warwick|
|Theses Department:||Department of Computer Science|
|Supervisor(s)/Advisor:||Anand, Sarabjot S. ; Griffiths, Nathan|
|Sponsors:||University of Warwick ; University of Warwick. Dept. of Computer Science|
|Extent:||xiii, 170 leaves : ill., charts|
Actions (login required)