Class StringsToAutomaton

java.lang.Object
org.apache.lucene.util.automaton.StringsToAutomaton

final class StringsToAutomaton extends Object
Builds a minimal, deterministic Automaton that accepts a set of strings, using the algorithm described in "Incremental Construction of Minimal Acyclic Finite-State Automata" by Daciuk, Mihov, Watson and Watson. The algorithm requires sorted input but is very fast (nearly linear in the input size). It can also build a binary Automaton representation directly. Users should access this functionality through the static methods of Automata.
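The sorted-input requirement refers to binary order, that is, order by unsigned UTF-8 byte value, which can differ from the order given by String.compareTo for supplementary characters. The following is a minimal sketch of preparing input in that order using only the JDK. The class BinarySortSketch and the method sortByUtf8Bytes are hypothetical names for illustration and are not part of Lucene.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BinarySortSketch {
    /** Returns a copy of the input sorted by unsigned UTF-8 byte order (hypothetical helper). */
    static List<String> sortByUtf8Bytes(List<String> terms) {
        List<String> sorted = new ArrayList<>(terms);
        // Arrays.compareUnsigned (Java 9+) compares byte arrays lexicographically,
        // treating each byte as a value in 0..255.
        sorted.sort((a, b) -> Arrays.compareUnsigned(
                a.getBytes(StandardCharsets.UTF_8),
                b.getBytes(StandardCharsets.UTF_8)));
        return sorted;
    }

    public static void main(String[] args) {
        String bmp = "\uFF61";                                          // UTF-8: EF BD A1
        String supplementary = new String(Character.toChars(0x1F600)); // UTF-8: F0 9F 98 80

        List<String> binaryOrder = sortByUtf8Bytes(List.of(supplementary, bmp, "a"));

        List<String> utf16Order = new ArrayList<>(List.of(supplementary, bmp, "a"));
        utf16Order.sort(String::compareTo); // compares UTF-16 code units

        // The two orders disagree on the last two elements.
        System.out.println(binaryOrder.equals(utf16Order)); // prints "false"
    }
}
```

In UTF-16 the supplementary character starts with the surrogate 0xD83D, which sorts below 0xFF61, while its UTF-8 encoding starts with 0xF0, which sorts above 0xEF. Sorting by encoded bytes avoids that mismatch.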
  • Field Details

  • Constructor Details

    • StringsToAutomaton

      private StringsToAutomaton()
      The default constructor is private. Use static methods directly.
  • Method Details

    • setPrevious

      private boolean setPrevious(BytesRef current)
      Copy current into an internal buffer.
    • convert

      Internal recursive traversal for conversion.
    • completeAndConvert

      private Automaton completeAndConvert()
      Called after adding all terms. Performs final minimization and converts to a standard Automaton instance.
    • build

      static Automaton build(Iterable<BytesRef> input, boolean asBinary)
      Builds a minimal, deterministic automaton from a sorted collection of BytesRef values, each holding a UTF-8 encoded string. The input must be binary-sorted, that is, ordered by unsigned byte value. Depending on asBinary, the resulting Automaton uses either Unicode code points or binary (compiled) bytes as transition labels.
    • build

      static Automaton build(BytesRefIterator input, boolean asBinary) throws IOException
      Builds a minimal, deterministic automaton from a sorted sequence of BytesRef values, each holding a UTF-8 encoded string. The input must be binary-sorted, that is, ordered by unsigned byte value. Depending on asBinary, the resulting Automaton uses either Unicode code points or binary (compiled) bytes as transition labels.
      Throws:
      IOException
    • add

      private void add(BytesRef current, boolean asBinary)
    • replaceOrRegister

      private void replaceOrRegister(StringsToAutomaton.State state)
      Replaces the last child of state with an equivalent, already registered state, or registers the last child state if no equivalent exists.
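
The add, replaceOrRegister and completion steps listed above follow the general shape of the incremental construction for sorted input. The sketch below illustrates that shape on plain Java strings with char labels. It is not Lucene's implementation: the class MinimalDafsaSketch, its State type, the string-based signature used as the registry key, and the methods accepts and countStates are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class MinimalDafsaSketch {
    /** Hypothetical state type; not Lucene's StringsToAutomaton.State. */
    static final class State {
        final TreeMap<Character, State> children = new TreeMap<>();
        boolean accept;
        int id = -1; // assigned once the state is registered

        /** Equivalent states share the accept flag and all (label, target id) pairs. */
        String signature() {
            StringBuilder sb = new StringBuilder(accept ? "1" : "0");
            for (Map.Entry<Character, State> e : children.entrySet()) {
                sb.append((int) e.getKey().charValue()).append(':')
                  .append(e.getValue().id).append(',');
            }
            return sb.toString();
        }
    }

    private final State root = new State();
    private final Map<String, State> registry = new HashMap<>();
    private String previous;
    private int nextId;

    /** Adds one string; input must arrive in ascending String.compareTo order. */
    void add(String current) {
        if (previous != null) {
            int cmp = current.compareTo(previous);
            if (cmp < 0) throw new IllegalArgumentException("input is not sorted: " + current);
            if (cmp == 0) return; // skip duplicates
        }
        // Follow the prefix shared with the previously added string.
        State state = root;
        int pos = 0;
        while (pos < current.length()) {
            State next = state.children.get(current.charAt(pos));
            if (next == null) break;
            state = next;
            pos++;
        }
        // The previous string's remaining suffix is final now, so minimize it.
        if (!state.children.isEmpty()) replaceOrRegister(state);
        // Append fresh states for the rest of the current string.
        for (; pos < current.length(); pos++) {
            State next = new State();
            state.children.put(current.charAt(pos), next);
            state = next;
        }
        state.accept = true;
        previous = current;
    }

    /** Replaces the last child with a registered equivalent, or registers it. */
    private void replaceOrRegister(State state) {
        Map.Entry<Character, State> last = state.children.lastEntry();
        State child = last.getValue();
        if (!child.children.isEmpty()) replaceOrRegister(child);
        State registered = registry.putIfAbsent(child.signature(), child);
        if (registered != null) {
            state.children.put(last.getKey(), registered);
        } else {
            child.id = nextId++;
        }
    }

    /** Call after the last add: minimizes the path of the final string. */
    void complete() {
        if (!root.children.isEmpty()) replaceOrRegister(root);
    }

    boolean accepts(String s) {
        State state = root;
        for (int i = 0; i < s.length() && state != null; i++) {
            state = state.children.get(s.charAt(i));
        }
        return state != null && state.accept;
    }

    /** Counts distinct reachable states (State uses identity equality). */
    int countStates() {
        Set<State> seen = new HashSet<>();
        Deque<State> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            State s = stack.pop();
            if (seen.add(s)) stack.addAll(s.children.values());
        }
        return seen.size();
    }

    static MinimalDafsaSketch build(List<String> sortedInput) {
        MinimalDafsaSketch builder = new MinimalDafsaSketch();
        for (String s : sortedInput) builder.add(s);
        builder.complete();
        return builder;
    }

    public static void main(String[] args) {
        MinimalDafsaSketch dafsa = build(List.of("tap", "taps", "top", "tops"));
        System.out.println(dafsa.countStates()); // 5 states; a plain trie would need 8
    }
}
```

Sorted input is what makes this incremental: once a string diverges from its predecessor, the predecessor's suffix can never change again, so it can be merged with equivalent registered states immediately.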