We have a maxDeterminizedStates = 10000 limit designed to keep regexp-type queries from blowing up.
But we have an adversarial regexp that will run for 268s on my laptop before hitting the exception, first reported here: https://github.com/opensearch-project/OpenSearch/issues/687
When I run the test and jstack the threads, this is what I see:
This is really sad, as getCommonSuffixBytesRef() is only supposed to be an "up-front" optimization to make the actual subsequent terms-intensive part of the query faster. But it makes the whole query run for nearly 5 minutes before it does anything.
So I definitely think we should improve getCommonSuffixBytesRef to be more "best-effort". For example, it could use a much lower limit of 1000 and catch the exception, like so:
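As a rough sketch of what I mean (the names `computeSuffixWithBudget` and `WorkBudgetExceededException` are illustrative stand-ins, not the real Lucene APIs), the key idea is: try the optimization under a small budget, and if the budget blows, just skip the optimization rather than failing the query:

```java
// Hypothetical sketch of a "best-effort" common-suffix computation: try under
// a small work budget, and fall back to "no common suffix" if the budget is hit.
public class BestEffortSuffix {
  // Thrown when the (simulated) determinize step exceeds its budget.
  static class WorkBudgetExceededException extends RuntimeException {}

  // Stand-in for the expensive suffix computation: here we pretend any input
  // longer than the budget is "too complex" and extract the literal suffix
  // after the last wildcard for inputs that are cheap enough.
  static String computeSuffixWithBudget(String input, int maxStates) {
    if (input.length() > maxStates) {
      throw new WorkBudgetExceededException();
    }
    return input.substring(input.lastIndexOf('*') + 1);
  }

  // Best-effort wrapper: a much lower bound (1000 instead of 10000), and an
  // empty suffix on failure -- the query still runs, just without the opto.
  static String bestEffortSuffix(String input) {
    final int LOW_LIMIT = 1000; // the reduced bound suggested above
    try {
      return computeSuffixWithBudget(input, LOW_LIMIT);
    } catch (WorkBudgetExceededException e) {
      return ""; // give up on the optimization, not on the query
    }
  }
}
```

The important property is that the catch path is cheap and the query still produces correct results, since the common suffix is only ever a filter in front of the real automaton.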
Another, maybe simpler option is to just check that the input automaton's state/transition counts don't exceed some low limit N before even attempting the computation.
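That check could be as simple as the following sketch (`Automaton` here is a minimal stand-in exposing counts, and `N` is an assumed value for "some low limit"):

```java
// Hypothetical pre-check: skip the common-suffix optimization entirely unless
// the input automaton is small. Not the real Lucene Automaton class.
public class SuffixPrecheck {
  static final int N = 1000; // assumed "some low limit N"

  static class Automaton {
    final int numStates;
    final int numTransitions;
    Automaton(int numStates, int numTransitions) {
      this.numStates = numStates;
      this.numTransitions = numTransitions;
    }
  }

  // Returning false means "run the query without the micro-optimization";
  // no exception handling needed because we never start the expensive work.
  static boolean worthComputingSuffix(Automaton a) {
    return a.numStates <= N && a.numTransitions <= N;
  }
}
```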
Basically this optimization is geared at stuff like a leading wildcard query of "*foo". By computing that the common suffix is "foo" we can spend less CPU in the terms dictionary, because we can first do a memcmp before having to run data through any finite state machine. It's really a micro-optimization and we shouldn't be spending whole seconds of CPU on it, ever.
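To illustrate why the suffix helps (a self-contained sketch, not the actual terms-dictionary code): for "*foo", a cheap byte comparison against the suffix rejects most terms before any automaton runs. The ranged `Arrays.equals` overload plays the role of the memcmp:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch of the suffix fast-path for a "*foo" query: memcmp the tail of each
// term against the common suffix bytes, and only run the FSM on survivors.
public class SuffixFilter {
  // Cheap "memcmp" step: does the term end with the suffix bytes?
  static boolean endsWith(byte[] term, byte[] suffix) {
    if (term.length < suffix.length) {
      return false;
    }
    return Arrays.equals(
        term, term.length - suffix.length, term.length,
        suffix, 0, suffix.length);
  }

  static boolean passesSuffixFilter(String term) {
    byte[] suffix = "foo".getBytes(StandardCharsets.UTF_8);
    byte[] t = term.getBytes(StandardCharsets.UTF_8);
    // A real implementation would only feed terms passing this check into
    // the finite state machine; everything else is rejected for free.
    return endsWith(t, suffix);
  }
}
```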
But I still don't quite understand how the current limits produce today's behavior; maybe there is a bigger issue here, and I don't want to sweep something under the rug.