A new taxonomy of sublinear keyword pattern matching algorithms

Research output: Book/ReportReportAcademic

86 Downloads (Pure)

Abstract

Abstract This paper presents a new taxonomy of sublinear (multiple) keyword pattern matching algorithms. Based on an earlier taxonomy by Watson and Zwaan [WZ96, WZ95], this new taxonomy includes not only suffix-based algorithms related to the Boyer-Moore, Commentz-Walter and Fan-Su algorithms, but factor- and factor oracle-based algorithms such as Backward DAWG Matching and Backward Oracle Matching as well. In particular, we show how suffix-based (Commentz-Walter like), factor- and factor oracle-based sublinear keyword pattern matching algorithms can all be seen as instantiations of a general sublinear algorithm skeleton. In addition, we show all shift functions defined for the suffix-based algorithms to be in principle reusable for factor- and factor oracle-based algorithms. The taxonomy is based on deriving the algorithms from a common starting point by adding algorithm and problem details, in order to arrive at efficient or well-known algorithms. Such a presentation provides correctness arguments for the algorithms as well as clarity on how the algorithms are related to one another. In addition, it is helpful in the construction of a toolkit of the algorithms.
Original languageEnglish
Place of PublicationEindhoven
PublisherTechnische Universiteit Eindhoven
Number of pages35
Publication statusPublished - 2004

Publication series

NameComputer science reports
Volume0407
ISSN (Print)0926-4515

Fingerprint Dive into the research topics of 'A new taxonomy of sublinear keyword pattern matching algorithms'. Together they form a unique fingerprint.

Cite this