Aggregate profile clustering for streaming analytics
Abbasoğlu, Mehmet Ali, Gedik, Buğra and Ferhatosmanoglu, Hakan (2015) Aggregate profile clustering for streaming analytics. Computer Journal, 58 (9). pp. 2092-2108. doi:10.1093/comjnl/bxv023 ISSN 0010-4620.
Research output not available from this repository.
Official URL: http://dx.doi.org/10.1093/comjnl/bxv023
Abstract
Many analytic applications require analyzing user interaction data. In particular, such data can be aggregated over a window to build user activity profiles. Clustering such aggregate profiles is useful for grouping together users with similar behaviors, so that common models could be built for them. In this paper, we present an approach for clustering profiles that are incrementally maintained over a stream of updates. Owing to the potentially large number of users and high rate of interactions, maintaining profile clusters can have high processing and memory resource requirements. To tackle this problem, we apply distributed stream processing. However, in the presence of distributed state, it is a major challenge to partition the profiles over nodes such that memory and computation balance is maintained, while keeping the clustering accuracy high. Furthermore, in order to adapt to potentially changing user interaction patterns, the partitioning of profiles to nodes should be continuously revised, yet one should minimize the migration of profiles so as not to disturb the online processing of updates. We develop a re-partitioning technique that achieves all these goals. To achieve this, we keep micro-cluster summaries at each node and periodically collect these summaries at a central node to perform re-partitioning. We use a greedy algorithm with novel affinity heuristics to revise the partitioning and update the routing tables without introducing a lengthy pause. We showcase the effectiveness of our approach using an application that clusters customers of a telecommunications company based on their aggregate calling profiles.
Item Type: | Journal Article |
---|---|
Divisions: | Faculty of Science, Engineering and Medicine > Science > Computer Science |
Journal or Publication Title: | Computer Journal |
Publisher: | Oxford University Press |
ISSN: | 0010-4620 |
Official Date: | 1 September 2015 |
Volume: | 58 |
Number: | 9 |
Page Range: | pp. 2092-2108 |
DOI: | 10.1093/comjnl/bxv023 |
Status: | Peer Reviewed |
Publication Status: | Published |
Access rights to Published version: | Restricted or Subscription Access |