This paper presents a new taxonomy of sublinear (multiple) keyword pattern matching algorithms. Based on an earlier taxonomy by Watson and Zwaan [WZ96, WZ95], this new taxonomy includes not only suffix-based algorithms related to the Boyer-Moore, Commentz-Walter and Fan-Su algorithms, but factor- and factor oracle-based algorithms such as Backward DAWG Matching and Backward Oracle Matching as well. In particular, we show how suffix-based (Commentz-Walter like), factor- and factor oracle-based sublinear keyword pattern matching algorithms can all be seen as instantiations of a general sublinear algorithm skeleton. In addition, we show all shift functions defined for the suffix-based algorithms to be in principle reusable for factor- and factor oracle-based algorithms. The taxonomy is based on deriving the algorithms from a common starting point by adding algorithm and problem details, in order to arrive at efficient or well-known algorithms. Such a presentation provides correctness arguments for the algorithms as well as clarity on how the algorithms are related to one another. In addition, it is helpful in the construction of a toolkit of the algorithms.
|Name||Computer science reports|