The Library

Performance and scalability of indexed subgraph query processing methods

Tools

Triantafillou, Peter (2016) Performance and scalability of indexed subgraph query processing methods. In: ALGOCLOUD 2015, Greece., 14–15 Sep 2016. Published in: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9511

Research output not available from this repository.

Request-a-Copy directly from author or use local Library Get it For Me service.

Official URL: https://link.springer.com/content/pdf/bfm%3A978-3-...

Request Changes to record.

Abstract

Graph data management systems have become very popular, as graphs are the natural data model for many applications. One of the main problems addressed by these systems is subgraph query processing; i.e., given a query graph, return all graphs that contain the query. The naive method for processing such queries is to perform a subgraph isomorphism test against each graph in the dataset. This obviously does not scale, as subgraph isomorphism is NP-complete. Thus, many indexing methods have been proposed to reduce the number of candidate graphs that have to underpass the subgraph isomorphism test. In this tutorial, we identify a set of key factors–parameters that influence the performance of related methods: namely, the number of nodes per graph, the graph density, the number of distinct labels, the number of graphs in the dataset, and the query graph size. We then discuss comprehensive and systematic experiments that analyze the sensitivity of the various methods on the values of the key parameters. Our aims are twofold: first to draw conclusions about the relative performance of the algorithms, and, second, to stress-test all algorithms, deriving insights as to their scalability and highlighting how both performance and scalability depend on these factors. We present six well-established indexing methods, namely, Grapes, CT-Index, GraphGrepSX, gIndex, Tree+, and gCode, as representative approaches of the overall design space, including the most recent and best-performing methods. We report on their index construction time and index size as well as on query processing performance in terms of time and false-positive ratio. We discuss performance on both real and synthetic datasets. Specifically, four real datasets of different characteristics are used: AIDS, PDBS, PCM, and PPI. In addition, we report on a large number of synthetic graph datasets, empowering us to systematically study the performance and scalability of the algorithms as they depend on the aforementioned key parameters. © Springer International Publishing Switzerland 2016.

Item Type:

Conference Item (Paper)

Divisions:

Faculty of Science, Engineering and Medicine > Science > Computer Science

Series Name:

Lecture Notes in Computer Science

Journal or Publication Title:

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Publisher:

Springer

Book Title:

Algorithmic Aspects of Cloud Computing

Official Date:

September 2016

Dates:

Date	Event
September 2016	Published

Volume:

9511

Status:

Peer Reviewed

Publication Status:

Published

Re-use Statement:

cited By 0

Conference Paper Type:

Paper

Title of Event:

ALGOCLOUD 2015

Type of Event:

Conference

Location of Event:

Greece.

Date(s) of Event:

14–15 Sep 2016

Related URLs:

Other

Request changes or add full text files to a record

Repository staff actions (login required)

View Item

University of Warwick
Publications service & WRAP

Highlight your research

The Library

Performance and scalability of indexed subgraph query processing methods

Abstract

Repository staff actions (login required)

University of WarwickPublications service & WRAP

Highlight your research

The Library

Performance and scalability of indexed subgraph query processing methods

Abstract

Repository staff actions (login required)

University of Warwick
Publications service & WRAP