Weak factor automata: Comparing (failure) oracles and storacles

Loek Cleophas, Derrick G. Kourie, Bruce W. Watson

    Research output: Chapter in Book/Report/Conference proceedingConference contributionAcademicpeer-review

    3 Citations (Scopus)

    Abstract

    The factor oracle [3] is a data structure for weak factor recognition. It is a deterministic finite automaton (DFA) built on a string p of length m that is acyclic, recognizes at least all factors of p, has m+ 1 states which are all final, is homogeneous, and has m to 2m-1 transitions. The factor storacle [6] is an alternative automaton that satisfies the same properties, except that its number of transitions may be larger than 2m-1, although it is conjectured to be linear with an upper bound of at most 3m. In [14] (among others), we described the concept of a failure automaton i.e. a failure DFA (FDFA), in which so-called failure transitions are used to reduce the total number of transitions and thus reduce representation space compared to the use of a DFA. We modify factor oracle and storacle construction algorithms to introduce failure arcs during the respective automata's construction. We thus end up with four deterministic automata types for weak factor recognition: factor oracle, factor storacle, failure factor oracle, and failure factor storacle. We compare them empirically in terms of size. The results show that despite the relative simplicity of (failure) factor (st)oracles, the failure versions show additional savings of 2-7% in number of transitions, for generated keywords of length 5-9, and of e.g. 5-9% for English words of lengths around 9-15. This may already be substantial in memory-restricted settings such as hardware implementations of automata. The results indicate the gains increase for longer keywords, which seems promising for applications in DNA processing and intrusion detection. Furthermore, our results provide a rather negative result on storacles: apart from rare cases, factor storacles do not have fewer transitions than factor oracles, and similarly for failure factor storacles versus failure factor oracles.

    Original languageEnglish
    Title of host publicationProceedings of the Prague Stringology Conference 2013, PSC 2013, 2-4 September 2013, Prague, Czech Republic
    Pages176-190
    Number of pages15
    Publication statusPublished - 2013
    EventPrague Stringology Conference 2013, PSC 2013 - Prague, Czech Republic
    Duration: 2 Sep 20134 Sep 2013

    Conference

    ConferencePrague Stringology Conference 2013, PSC 2013
    CountryCzech Republic
    CityPrague
    Period2/09/134/09/13

    Keywords

    • Approximate automaton
    • Factor oracle
    • Failure automaton
    • Pattern matching
    • Weak factor recognition

    Fingerprint Dive into the research topics of 'Weak factor automata: Comparing (failure) oracles and storacles'. Together they form a unique fingerprint.

    Cite this