Abstract
Minimal acyclic deterministic finite automata (MADFAs) are used to represent dictionaries, i.e., finite sets of finite words, in, e.g., spell checkers and network security applications. Given the size of such dictionaries, which may contain millions of words, their efficient construction is a critical issue. Watson (2010) published a classification of such algorithms in an algorithm taxonomy with correctness arguments. We report on a new algorithm which constructs MADFAs in parallel---each for a keyword set from a partition of the original keyword set---and afterwards merges and minimizes the resulting automata into a single MADFA; on our experience implementing the algorithms in a Java-based toolkit; and on empirical performance results obtained.
Original language | English |
---|---|
Title of host publication | Proceedings of the Prague Stringology Conference PSC 2017, August 28-30, 2017, Prague, Czech Republic |
Editors | J. Holub, J. Zdarek |
Place of Publication | Prague |
Publisher | Czech Technical University in Prague |
Pages | 126-142 |
Number of pages | 17 |
ISBN (Print) | 978-80-01-06193-0 |
Publication status | Published - 2017 |
Event | Prague Stringology Conference 2017 - Prague, Czech Republic Duration: 28 Aug 2017 → 30 Aug 2017 |
Conference
Conference | Prague Stringology Conference 2017 |
---|---|
Abbreviated title | PSC |
Country/Territory | Czech Republic |
City | Prague |
Period | 28/08/17 → 30/08/17 |