Description
Currently the toAutomaton() method of the LevenshteinAutomata class uses the default constructor of the Automaton although it exactly knows how many states the automaton will have and can do a reasonable estimation on how many transitions it will need as well.
I suggest changing the lines
// the number of states is based on the length of the word and n int numStates = description.size(); Automaton a = new Automaton(); int lastState;
to
// the number of states is based on the length of the word and n final int numStates = description.size(); final int numTransitions = numStates * Math.min(1 + 2 * n, alphabet.length); final int prefixStates = prefix != null ? prefix.codePointCount(0, prefix.length()) : 0; final Automaton a = new Automaton(numStates + prefixStates, numTransitions); int lastState;
For my test data this cut down on the total amount of memory needed for int arrays by factor 4. The estimation of "1 + 2 * editDistance" should maybe rather be replaced by a value coming from the ParametricDescription used.