This is going to be excruciatingly slow? We could at least cache the hash code once computed...
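For illustration, caching the hash could look roughly like the sketch below. This is not Lucene code: `CachedDotHash` and its `Supplier<String>` (standing in for a call such as `Automaton.toDot()`) are hypothetical names, and the sketch assumes the automaton no longer changes once the hash has been requested.

```java
import java.util.function.Supplier;

// Hypothetical wrapper (not part of Lucene): caches the hash of an expensive string rendering.
final class CachedDotHash {
    // Stands in for something like Automaton.toDot(); assumed to return the same text every time.
    private final Supplier<String> dotSupplier;

    // 0 means "not computed yet", the same idiom java.lang.String uses for its own hash.
    private int hash;

    CachedDotHash(Supplier<String> dotSupplier) {
        this.dotSupplier = dotSupplier;
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            // The expensive part: render once, hash the rendering, remember the result.
            // If the real hash happens to be 0 it is simply recomputed on every call.
            h = dotSupplier.get().hashCode();
            hash = h;
        }
        return h;
    }
}
```

This only removes the repeated cost; the first call still pays for the full rendering, and equals() would still need its own answer.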
Make Query.hashCode and Query.equals abstract
I think the historical pain here was that you can't easily implement this so that equals means "same language" unless you do minimization?
Either way toDot() is crazy
Yeah, I think it may be difficult... but then I also wouldn't bet toDot() actually implies any better equivalence – I don't think it's "canonical" in any way.
This should be investigated. I'm working through LUCENE-7277 first, though, trying to make code patterns more consistent (and perhaps a bit easier on the eyes).
I've been thinking about it, and since it's really quite difficult to canonicalize an automaton, my proposed solution is to implement instance equivalence instead. This can be TermAutomatonQuery equivalence or equivalence of the underlying Automaton; I don't have a strong opinion on this. Either way it will actually be more cache-friendly than the current way of computing hashCode by dumping everything to a (potentially huge) Graphviz string...
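A minimal sketch of what instance equivalence on the underlying automaton might look like, assuming the automaton type keeps Object's identity-based equals/hashCode. `SketchAutomaton` and `SketchAutomatonQuery` are hypothetical stand-ins for illustration, not the actual Lucene classes or the committed fix.

```java
// Hypothetical stand-in for an automaton; it deliberately does not override equals/hashCode.
final class SketchAutomaton {}

// Hypothetical query that compares its automaton by reference instead of by content.
final class SketchAutomatonQuery {
    private final String field;
    private final SketchAutomaton automaton;

    SketchAutomatonQuery(String field, SketchAutomaton automaton) {
        this.field = field;
        this.automaton = automaton;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof SketchAutomatonQuery)) {
            return false;
        }
        SketchAutomatonQuery that = (SketchAutomatonQuery) other;
        // Same field and the very same automaton instance; no attempt to compare languages.
        return field.equals(that.field) && automaton == that.automaton;
    }

    @Override
    public int hashCode() {
        // Constant-time: no traversal or rendering of the automaton.
        return 31 * field.hashCode() + System.identityHashCode(automaton);
    }
}
```

The trade-off is that two separately built automata accepting the same language compare as different, so a query cache would miss in that case, but hashing no longer depends on the size of the automaton.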
Whoops, sorry about this mess. +1 for instance equivalence on either the automaton or the query.
Thx, will fix it soon.
Commit c367f51793e02220dc9f276aaa1b26c6434aa254 in lucene-solr's branch refs/heads/branch_6x from Dawid Weiss
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c367f51 ]
LUCENE-7295: TermAutomatonQuery.hashCode calculates Automaton.toDot().hash.
Commit 6e8ca1a094ee8dda61f4e210e310ad26e6decacf in lucene-solr's branch refs/heads/master from Dawid Weiss
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=6e8ca1a ]