The Library
Performance and scalability of indexed subgraph query processing methods
Tools
Katsarou, F., Ntarmos, N. and Triantafillou, Peter (2015) Performance and scalability of indexed subgraph query processing methods. Proceedings of the VLDB Endowment, 8 (12). pp. 1566-1577. doi:10.14778/2824032.2824054 ISSN 2150-8097.
Research output not available from this repository.
Request-a-Copy directly from author or use local Library Get it For Me service.
Official URL: http://doi.org/10.14778/2824032.2824054
Abstract
Graph data management systems have become very popular as graphs are the natural data model for many applications. One of the main problems addressed by these systems is subgraph query processing; i.e., given a query graph, return all graphs that contain the query. The naive method for processing such queries is to perform a subgraph isomorphism test against each graph in the dataset. This obviously does not scale, as subgraph isomorphism is NP-Complete. Thus, many indexing methods have been proposed to reduce the number of candidate graphs that have to underpass the subgraph isomorphism test. In this paper, we identify a set of key factors-parameters, that in uence the performance of related methods: namely, the number of nodes per graph, the graph density, the number of distinct labels, the number of graphs in the dataset, and the query graph size. We then conduct comprehensive and systematic experiments that analyze the sensitivity of the various methods on the values of the key parameters. Our aims are twofold: first to derive conclusions about the algorithms' relative performance, and, second, to stress-test all algorithms, deriving insights as to their scalability, and highlight how both performance and scalability depend on the above factors. We choose six wellestablished indexing methods, namely Grapes, CT-Index, GraphGrepSX, gIndex, Tree+Δ, and gCode, as representative approaches of the overall design space, including the most recent and best performing methods. We report on their index construction time and index size, and on query processing performance in terms of time and false positive ratio. We employ both real and synthetic datasets. Specifically, four real datasets of different characteristics are used: AIDS, PDBS, PCM, and PPI. In addition, we generate a large number of synthetic graph datasets, empowering us to systematically study the algorithms' performance and scalability versus the aforementioned key parameters. © 2015 VLDB Endowment 2150-8097/15/08.
Item Type: | Journal Article | ||||
---|---|---|---|---|---|
Divisions: | Faculty of Science, Engineering and Medicine > Science > Computer Science | ||||
Journal or Publication Title: | Proceedings of the VLDB Endowment | ||||
Publisher: | ACM | ||||
ISSN: | 2150-8097 | ||||
Official Date: | August 2015 | ||||
Dates: |
|
||||
Volume: | 8 | ||||
Number: | 12 | ||||
Page Range: | pp. 1566-1577 | ||||
DOI: | 10.14778/2824032.2824054 | ||||
Status: | Peer Reviewed | ||||
Publication Status: | Published | ||||
Re-use Statement: | cited By 6 | ||||
Access rights to Published version: | Restricted or Subscription Access | ||||
Conference Paper Type: | Paper | ||||
Title of Event: | 41st International Conference on Very Large Data Bases | ||||
Type of Event: | Conference | ||||
Location of Event: | Kohala Coast, Hawaii | ||||
Date(s) of Event: | Aug 2015 |
Request changes or add full text files to a record
Repository staff actions (login required)
View Item |