Investigations on path indexing for graph databases

Jonathan M. Sumrall, George H.L. Fletcher, Alexandra Poulovassilis, Johan Svensson, Magnus Vejlstrup, Chris Vest, Jim Webber

Research output: Chapter in Book/Report/Conference proceedingConference contributionAcademicpeer-review

Abstract

Graph databases have become an increasingly popular choice for the management of the massive network data sets arising in many contemporary applications. We investigate the effectiveness of path indexing for accelerating query processing in graph database systems, using as an exemplar the widely used open-source Neo4j graph database. We present a novel path index design which supports efficient ordered access to paths in a graph dataset. Our index is fully persistent and designed for external memory storage and retrieval. We also describe a compression scheme that exploits the limited differences between consecutive keys in the index, as well as a workload-driven approach to indexing. We demonstrate empirically the speed-ups achieved by our implementation, showing that the path index yields query run-times from 2x up to 8000x faster than Neo4j. Empirical evaluation also shows that our scheme leads to smaller indexes than using general-purpose LZ4 compression. The complete stand-alone implementation of our index, as well as supporting tooling such as a bulk-loader, are provided as open source for further research and development.

Original languageEnglish
Title of host publicationEuro-Par 2016
Subtitle of host publicationParallel Processing Workshops - Euro-Par 2016 International Workshops, Revised Selected Papers
EditorsPierre-Francois Dutot, Frederic Desprez
Place of PublicationCham
PublisherSpringer
Pages532-544
Number of pages13
ISBN (Electronic)978-3-319-58943-5
ISBN (Print)978-3-319-58942-8
DOIs
Publication statusPublished - 1 Jan 2017
Event22nd International Conference on Parallel and Distributed Computing, Euro-Par 2016 - Grenoble, France
Duration: 24 Aug 201626 Aug 2016

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume10104 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference22nd International Conference on Parallel and Distributed Computing, Euro-Par 2016
Country/TerritoryFrance
CityGrenoble
Period24/08/1626/08/16

Fingerprint

Dive into the research topics of 'Investigations on path indexing for graph databases'. Together they form a unique fingerprint.

Cite this