A new taxonomy of sublinear right-to-left scanning keyword pattern matching algorithms

Onderzoeksoutput: Bijdrage aan tijdschriftTijdschriftartikelAcademicpeer review

13 Citaties (Scopus)

Uittreksel

A new taxonomy of sublinear (multiple) keyword pattern matching algorithms is presented. Based on an earlier taxonomy by the second and third author, this new taxonomy includes not only suffix-based algorithms, but also factor- and factor oracle-based algorithms. In particular, we show how suffix-based (Commentz-Walter like), factor- and factor oracle-based sublinear keyword pattern matching algorithms can be seen as instantiations of a general sublinear algorithm skeleton. During processing, such algorithms shift or jump through the text in a forward or left-to-right direction, and read backward or right-to-left starting from positions in the text, i.e. they read suffixes of certain prefixes of the text. They use finite automata for efficient computation of string membership in a certain language. In addition, we show shift functions defined for the suffix-based algorithms to be reusable for factor- and factor oracle-based algorithms. The taxonomy is based on deriving the algorithms from a common starting point by adding algorithm and problem details, to arrive at efficient or well-known algorithms. Such a presentation provides correctness arguments for the algorithms as well as clarity on how the algorithms are related to one another. In addition, it is helpful in the construction of a toolkit of the algorithms.
TaalEngels
Pagina's1095-1112
TijdschriftScience of Computer Programming
Volume75
Nummer van het tijdschrift11
DOI's
StatusGepubliceerd - 2010

Vingerafdruk

Pattern matching
Taxonomies
Scanning
Finite automata

Citeer dit

@article{13d7610f00d44e6282d77531bd7b8fc6,
title = "A new taxonomy of sublinear right-to-left scanning keyword pattern matching algorithms",
abstract = "A new taxonomy of sublinear (multiple) keyword pattern matching algorithms is presented. Based on an earlier taxonomy by the second and third author, this new taxonomy includes not only suffix-based algorithms, but also factor- and factor oracle-based algorithms. In particular, we show how suffix-based (Commentz-Walter like), factor- and factor oracle-based sublinear keyword pattern matching algorithms can be seen as instantiations of a general sublinear algorithm skeleton. During processing, such algorithms shift or jump through the text in a forward or left-to-right direction, and read backward or right-to-left starting from positions in the text, i.e. they read suffixes of certain prefixes of the text. They use finite automata for efficient computation of string membership in a certain language. In addition, we show shift functions defined for the suffix-based algorithms to be reusable for factor- and factor oracle-based algorithms. The taxonomy is based on deriving the algorithms from a common starting point by adding algorithm and problem details, to arrive at efficient or well-known algorithms. Such a presentation provides correctness arguments for the algorithms as well as clarity on how the algorithms are related to one another. In addition, it is helpful in the construction of a toolkit of the algorithms.",
author = "L.G.W.A. Cleophas and B.W. Watson and G. Zwaan",
year = "2010",
doi = "10.1016/j.scico.2010.04.012",
language = "English",
volume = "75",
pages = "1095--1112",
journal = "Science of Computer Programming",
issn = "0167-6423",
publisher = "Elsevier",
number = "11",

}

A new taxonomy of sublinear right-to-left scanning keyword pattern matching algorithms. / Cleophas, L.G.W.A.; Watson, B.W.; Zwaan, G.

In: Science of Computer Programming, Vol. 75, Nr. 11, 2010, blz. 1095-1112.

Onderzoeksoutput: Bijdrage aan tijdschriftTijdschriftartikelAcademicpeer review

TY - JOUR

T1 - A new taxonomy of sublinear right-to-left scanning keyword pattern matching algorithms

AU - Cleophas,L.G.W.A.

AU - Watson,B.W.

AU - Zwaan,G.

PY - 2010

Y1 - 2010

N2 - A new taxonomy of sublinear (multiple) keyword pattern matching algorithms is presented. Based on an earlier taxonomy by the second and third author, this new taxonomy includes not only suffix-based algorithms, but also factor- and factor oracle-based algorithms. In particular, we show how suffix-based (Commentz-Walter like), factor- and factor oracle-based sublinear keyword pattern matching algorithms can be seen as instantiations of a general sublinear algorithm skeleton. During processing, such algorithms shift or jump through the text in a forward or left-to-right direction, and read backward or right-to-left starting from positions in the text, i.e. they read suffixes of certain prefixes of the text. They use finite automata for efficient computation of string membership in a certain language. In addition, we show shift functions defined for the suffix-based algorithms to be reusable for factor- and factor oracle-based algorithms. The taxonomy is based on deriving the algorithms from a common starting point by adding algorithm and problem details, to arrive at efficient or well-known algorithms. Such a presentation provides correctness arguments for the algorithms as well as clarity on how the algorithms are related to one another. In addition, it is helpful in the construction of a toolkit of the algorithms.

AB - A new taxonomy of sublinear (multiple) keyword pattern matching algorithms is presented. Based on an earlier taxonomy by the second and third author, this new taxonomy includes not only suffix-based algorithms, but also factor- and factor oracle-based algorithms. In particular, we show how suffix-based (Commentz-Walter like), factor- and factor oracle-based sublinear keyword pattern matching algorithms can be seen as instantiations of a general sublinear algorithm skeleton. During processing, such algorithms shift or jump through the text in a forward or left-to-right direction, and read backward or right-to-left starting from positions in the text, i.e. they read suffixes of certain prefixes of the text. They use finite automata for efficient computation of string membership in a certain language. In addition, we show shift functions defined for the suffix-based algorithms to be reusable for factor- and factor oracle-based algorithms. The taxonomy is based on deriving the algorithms from a common starting point by adding algorithm and problem details, to arrive at efficient or well-known algorithms. Such a presentation provides correctness arguments for the algorithms as well as clarity on how the algorithms are related to one another. In addition, it is helpful in the construction of a toolkit of the algorithms.

U2 - 10.1016/j.scico.2010.04.012

DO - 10.1016/j.scico.2010.04.012

M3 - Article

VL - 75

SP - 1095

EP - 1112

JO - Science of Computer Programming

T2 - Science of Computer Programming

JF - Science of Computer Programming

SN - 0167-6423

IS - 11

ER -