The Library
Performance and scalability of indexed subgraph query processing methods
Tools
Triantafillou, Peter (2016) Performance and scalability of indexed subgraph query processing methods. In: ALGOCLOUD 2015, Greece., 14–15 Sep 2016. Published in: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9511
Research output not available from this repository.
Request-a-Copy directly from author or use local Library Get it For Me service.
Official URL: https://link.springer.com/content/pdf/bfm%3A978-3-...
Abstract
Graph data management systems have become very popular, as graphs are the natural data model for many applications. One of the main problems addressed by these systems is subgraph query processing; i.e., given a query graph, return all graphs that contain the query. The naive method for processing such queries is to perform a subgraph isomorphism test against each graph in the dataset. This obviously does not scale, as subgraph isomorphism is NP-complete. Thus, many indexing methods have been proposed to reduce the number of candidate graphs that have to underpass the subgraph isomorphism test. In this tutorial, we identify a set of key factors–parameters that influence the performance of related methods: namely, the number of nodes per graph, the graph density, the number of distinct labels, the number of graphs in the dataset, and the query graph size. We then discuss comprehensive and systematic experiments that analyze the sensitivity of the various methods on the values of the key parameters. Our aims are twofold: first to draw conclusions about the relative performance of the algorithms, and, second, to stress-test all algorithms, deriving insights as to their scalability and highlighting how both performance and scalability depend on these factors. We present six well-established indexing methods, namely, Grapes, CT-Index, GraphGrepSX, gIndex, Tree+, and gCode, as representative approaches of the overall design space, including the most recent and best-performing methods. We report on their index construction time and index size as well as on query processing performance in terms of time and false-positive ratio. We discuss performance on both real and synthetic datasets. Specifically, four real datasets of different characteristics are used: AIDS, PDBS, PCM, and PPI. In addition, we report on a large number of synthetic graph datasets, empowering us to systematically study the performance and scalability of the algorithms as they depend on the aforementioned key parameters. © Springer International Publishing Switzerland 2016.
Item Type: | Conference Item (Paper) | ||||
---|---|---|---|---|---|
Divisions: | Faculty of Science, Engineering and Medicine > Science > Computer Science | ||||
Series Name: | Lecture Notes in Computer Science | ||||
Journal or Publication Title: | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | ||||
Publisher: | Springer | ||||
Book Title: | Algorithmic Aspects of Cloud Computing | ||||
Official Date: | September 2016 | ||||
Dates: |
|
||||
Volume: | 9511 | ||||
Status: | Peer Reviewed | ||||
Publication Status: | Published | ||||
Re-use Statement: | cited By 0 | ||||
Conference Paper Type: | Paper | ||||
Title of Event: | ALGOCLOUD 2015 | ||||
Type of Event: | Conference | ||||
Location of Event: | Greece. | ||||
Date(s) of Event: | 14–15 Sep 2016 | ||||
Related URLs: |
Request changes or add full text files to a record
Repository staff actions (login required)
View Item |