Many-MADFAct: concurrently constructing MADFAs

T. Runge, I. Schaefer, L.G.W.A. Cleophas, B.W. Watson

Research output: Chapter in Book/Report/Conference proceedingConference contributionAcademicpeer-review

13 Downloads (Pure)

Abstract

Minimal acyclic deterministic finite automata (MADFAs) are used to represent dictionaries, i.e., finite sets of finite words, in, e.g., spell checkers and network security applications. Given the size of such dictionaries, which may contain millions of words, their efficient construction is a critical issue. Watson (2010) published a classification of such algorithms in an algorithm taxonomy with correctness arguments. We report on a new algorithm which constructs MADFAs in parallel---each for a keyword set from a partition of the original keyword set---and afterwards merges and minimizes the resulting automata into a single MADFA; on our experience implementing the algorithms in a Java-based toolkit; and on empirical performance results obtained.
Original languageEnglish
Title of host publicationProceedings of the Prague Stringology Conference PSC 2017, August 28-30, 2017, Prague, Czech Republic
EditorsJ. Holub, J. Zdarek
Place of PublicationPrague
PublisherCzech Technical University in Prague
Pages126-142
Number of pages17
ISBN (Print)978-80-01-06193-0
Publication statusPublished - 2017
EventPrague Stringology Conference 2017 - Prague, Czech Republic
Duration: 28 Aug 201730 Aug 2017

Conference

ConferencePrague Stringology Conference 2017
Abbreviated titlePSC
CountryCzech Republic
CityPrague
Period28/08/1730/08/17

Fingerprint

Finite automata
Glossaries
Network security
Taxonomies

Cite this

Runge, T., Schaefer, I., Cleophas, L. G. W. A., & Watson, B. W. (2017). Many-MADFAct: concurrently constructing MADFAs. In J. Holub, & J. Zdarek (Eds.), Proceedings of the Prague Stringology Conference PSC 2017, August 28-30, 2017, Prague, Czech Republic (pp. 126-142). Prague: Czech Technical University in Prague.
Runge, T. ; Schaefer, I. ; Cleophas, L.G.W.A. ; Watson, B.W. / Many-MADFAct : concurrently constructing MADFAs. Proceedings of the Prague Stringology Conference PSC 2017, August 28-30, 2017, Prague, Czech Republic . editor / J. Holub ; J. Zdarek. Prague : Czech Technical University in Prague, 2017. pp. 126-142
@inproceedings{44b3c3a11e4543ccbfd42d3ea30caf08,
title = "Many-MADFAct: concurrently constructing MADFAs",
abstract = "Minimal acyclic deterministic finite automata (MADFAs) are used to represent dictionaries, i.e., finite sets of finite words, in, e.g., spell checkers and network security applications. Given the size of such dictionaries, which may contain millions of words, their efficient construction is a critical issue. Watson (2010) published a classification of such algorithms in an algorithm taxonomy with correctness arguments. We report on a new algorithm which constructs MADFAs in parallel---each for a keyword set from a partition of the original keyword set---and afterwards merges and minimizes the resulting automata into a single MADFA; on our experience implementing the algorithms in a Java-based toolkit; and on empirical performance results obtained.",
author = "T. Runge and I. Schaefer and L.G.W.A. Cleophas and B.W. Watson",
year = "2017",
language = "English",
isbn = "978-80-01-06193-0",
pages = "126--142",
editor = "J. Holub and J. Zdarek",
booktitle = "Proceedings of the Prague Stringology Conference PSC 2017, August 28-30, 2017, Prague, Czech Republic",
publisher = "Czech Technical University in Prague",
address = "Czech Republic",

}

Runge, T, Schaefer, I, Cleophas, LGWA & Watson, BW 2017, Many-MADFAct: concurrently constructing MADFAs. in J Holub & J Zdarek (eds), Proceedings of the Prague Stringology Conference PSC 2017, August 28-30, 2017, Prague, Czech Republic . Czech Technical University in Prague, Prague, pp. 126-142, Prague Stringology Conference 2017, Prague, Czech Republic, 28/08/17.

Many-MADFAct : concurrently constructing MADFAs. / Runge, T.; Schaefer, I.; Cleophas, L.G.W.A.; Watson, B.W.

Proceedings of the Prague Stringology Conference PSC 2017, August 28-30, 2017, Prague, Czech Republic . ed. / J. Holub; J. Zdarek. Prague : Czech Technical University in Prague, 2017. p. 126-142.

Research output: Chapter in Book/Report/Conference proceedingConference contributionAcademicpeer-review

TY - GEN

T1 - Many-MADFAct

T2 - concurrently constructing MADFAs

AU - Runge, T.

AU - Schaefer, I.

AU - Cleophas, L.G.W.A.

AU - Watson, B.W.

PY - 2017

Y1 - 2017

N2 - Minimal acyclic deterministic finite automata (MADFAs) are used to represent dictionaries, i.e., finite sets of finite words, in, e.g., spell checkers and network security applications. Given the size of such dictionaries, which may contain millions of words, their efficient construction is a critical issue. Watson (2010) published a classification of such algorithms in an algorithm taxonomy with correctness arguments. We report on a new algorithm which constructs MADFAs in parallel---each for a keyword set from a partition of the original keyword set---and afterwards merges and minimizes the resulting automata into a single MADFA; on our experience implementing the algorithms in a Java-based toolkit; and on empirical performance results obtained.

AB - Minimal acyclic deterministic finite automata (MADFAs) are used to represent dictionaries, i.e., finite sets of finite words, in, e.g., spell checkers and network security applications. Given the size of such dictionaries, which may contain millions of words, their efficient construction is a critical issue. Watson (2010) published a classification of such algorithms in an algorithm taxonomy with correctness arguments. We report on a new algorithm which constructs MADFAs in parallel---each for a keyword set from a partition of the original keyword set---and afterwards merges and minimizes the resulting automata into a single MADFA; on our experience implementing the algorithms in a Java-based toolkit; and on empirical performance results obtained.

M3 - Conference contribution

SN - 978-80-01-06193-0

SP - 126

EP - 142

BT - Proceedings of the Prague Stringology Conference PSC 2017, August 28-30, 2017, Prague, Czech Republic

A2 - Holub, J.

A2 - Zdarek, J.

PB - Czech Technical University in Prague

CY - Prague

ER -

Runge T, Schaefer I, Cleophas LGWA, Watson BW. Many-MADFAct: concurrently constructing MADFAs. In Holub J, Zdarek J, editors, Proceedings of the Prague Stringology Conference PSC 2017, August 28-30, 2017, Prague, Czech Republic . Prague: Czech Technical University in Prague. 2017. p. 126-142