Towards detecting patterns in failure logs of large-scale distributed systems
Gurumdimma, Nentawe, Jhumka, Arshad, Liakata, Maria, Chuah, Edward and Browne, James (2015) Towards detecting patterns in failure logs of large-scale distributed systems. In: 2015 IEEE International Parallel and Distributed Processing Symposium Workshop (IPDPSW), Hyderabad, 25-29 May 2015, pp. 1052-1061. doi:10.1109/IPDPSW.2015.109
Official URL: http://dx.doi.org/10.1109/IPDPSW.2015.109
Abstract
The ability to automatically detect faults or fault patterns is important for system administrators seeking to improve system reliability and reduce system failures. To achieve this objective, the message logs from cluster systems are typically augmented with failure information, i.e., the raw log data is labelled. However, tagging or labelling raw log data is very costly. In this paper, our objective is to detect failure patterns in the message logs using unlabelled data. To achieve our aim, we propose a methodology in which a pre-processing step first removes redundant data. A clustering algorithm is then executed on the resulting logs, and we further developed an unsupervised algorithm that detects failure patterns in the clustered logs by harnessing the characteristics of these sequences. We evaluated our methodology on large production data, and the results show that, on average, an F-measure of 78% can be obtained without data labels. The implication of our methodology is that a system administrator with little knowledge of the system can detect failure runs with reasonably high accuracy.
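The abstract names three ingredients: removal of redundant log data, clustering of the remaining messages, and evaluation by F-measure. The following is a minimal sketch of what such steps might look like, not the authors' algorithm, which the abstract does not specify. The masking rule, the "consecutive duplicate" notion of redundancy, the template-based grouping, and all function names are assumptions made for illustration.

```python
import re
from collections import defaultdict


def normalise(message):
    """Mask hex values and numbers so messages that differ only in
    variable fields map to the same template (an assumed heuristic)."""
    return re.sub(r"0x[0-9a-fA-F]+|\d+", "<*>", message).strip()


def remove_redundant(lines):
    """Keep a line only if its template differs from the previous line's.

    'Redundant' here means a consecutive repeat of the same template;
    the paper may define redundancy differently.
    """
    kept = []
    previous = None
    for line in lines:
        template = normalise(line)
        if template != previous:
            kept.append(line)
        previous = template
    return kept


def cluster_by_template(lines):
    """Group messages by template, a simple stand-in for the
    (unspecified) clustering algorithm."""
    clusters = defaultdict(list)
    for line in lines:
        clusters[normalise(line)].append(line)
    return dict(clusters)


def f_measure(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical log lines, invented for this example.
sample = [
    "node 12 link error",
    "node 13 link error",
    "disk 0x1f failed",
    "node 7 link error",
]
filtered = remove_redundant(sample)
clusters = cluster_by_template(filtered)
```

Grouping by masked template keeps the sketch dependency-free; a real pipeline would likely use a similarity-based clustering method and a sequence-level detector on top of it.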
Item Type: | Conference Item (Paper) |
---|---|
Divisions: | Faculty of Science, Engineering and Medicine > Science > Computer Science |
Publisher: | IEEE |
Book Title: | 2015 IEEE International Parallel and Distributed Processing Symposium Workshop |
Official Date: | 2015 |
Page Range: | pp. 1052-1061 |
DOI: | 10.1109/IPDPSW.2015.109 |
Status: | Peer Reviewed |
Publication Status: | Published |
Conference Paper Type: | Paper |
Title of Event: | 2015 IEEE International Parallel and Distributed Processing Symposium Workshop (IPDPSW) |
Type of Event: | Workshop |
Location of Event: | Hyderabad |
Date(s) of Event: | 25-29 May 2015 |