  1. Lucene - Core
  2. LUCENE-7295

TermAutomatonQuery.hashCode calculates Automaton.toDot().hash

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.x, 7.0
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

      Description

      Hashing the full toDot() output is going to be excruciatingly slow. We could at least cache the hash code once it has been computed...
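      The caching idea mentioned above could look roughly like the following. This is a minimal sketch, not Lucene code: `ExpensiveHashQuery`, its fields, and the `computations` counter are hypothetical names used only for illustration, and the sketch assumes the object is not mutated after the first `hashCode()` call and is not shared across threads.

```java
// Hypothetical stand-in for a query whose hash is expensive to derive.
// This is NOT Lucene's TermAutomatonQuery.
final class ExpensiveHashQuery {
  private final String field;
  private final String description; // stands in for a large text dump of the automaton

  private boolean hashComputed; // false until the first hashCode() call
  private int cachedHash;       // valid only when hashComputed is true
  int computations;             // exposed only to make the caching observable

  ExpensiveHashQuery(String field, String description) {
    this.field = field;
    this.description = description;
  }

  @Override
  public int hashCode() {
    if (!hashComputed) {
      computations++;
      // The expensive part runs once; later calls reuse the stored value.
      cachedHash = 31 * field.hashCode() + description.hashCode();
      hashComputed = true;
    }
    return cachedHash;
  }

  @Override
  public boolean equals(Object other) {
    if (this == other) return true;
    if (!(other instanceof ExpensiveHashQuery)) return false;
    ExpensiveHashQuery that = (ExpensiveHashQuery) other;
    return field.equals(that.field) && description.equals(that.description);
  }
}
```

      Caching removes the repeated cost, but the first call still pays for building and hashing the whole dump, and equals() still has to compare it.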

          Activity

          rcmuir Robert Muir added a comment -

          I think the historical pain here was that you can't easily implement this so that equals means "same language" unless you minimize the automaton?

          Either way, hashing toDot() is crazy.

          dweiss Dawid Weiss added a comment -

          Yeah, I think it may be difficult... but then I also wouldn't bet toDot() actually implies any better equivalence – I don't think it's "canonical" in any way.

          This should be investigated. I'm working through LUCENE-7277 first, though, trying to make code patterns more consistent (and perhaps a bit easier on the eyes).
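          To illustrate why a text dump need not be canonical, here is a toy sketch. `ToyAutomaton`, `edge`, `accepts`, and `dump` are hypothetical and unrelated to Lucene's Automaton class or its real toDot() output; the sketch only assumes that a dump lists states by their numbers. Two automata that accept exactly the same string can then still produce different dumps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy deterministic automaton for illustration only; state 0 is always the start state.
final class ToyAutomaton {
  private final Map<String, Integer> transitions = new LinkedHashMap<>();
  private final int acceptState;

  ToyAutomaton(int acceptState) {
    this.acceptState = acceptState;
  }

  ToyAutomaton edge(int from, char label, int to) {
    transitions.put(from + ":" + label, to);
    return this;
  }

  boolean accepts(String input) {
    int state = 0;
    for (char c : input.toCharArray()) {
      Integer next = transitions.get(state + ":" + c);
      if (next == null) return false; // no transition for this character
      state = next;
    }
    return state == acceptState;
  }

  // Hypothetical dot-like text dump that lists transitions by state number.
  String dump() {
    StringBuilder sb = new StringBuilder("digraph {\n");
    for (Map.Entry<String, Integer> e : transitions.entrySet()) {
      sb.append("  ").append(e.getKey()).append(" -> ").append(e.getValue()).append('\n');
    }
    return sb.append("  accept ").append(acceptState).append("\n}").toString();
  }
}
```

          For example, `new ToyAutomaton(2).edge(0, 'a', 1).edge(1, 'b', 2)` and `new ToyAutomaton(1).edge(0, 'a', 2).edge(2, 'b', 1)` both accept only "ab", yet their dumps differ because the state numbers differ. An equality based on such a dump would treat them as different.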

          dweiss Dawid Weiss added a comment -

          I've been thinking about it. Since it's really quite difficult to canonicalize an automaton, my proposed solution is to implement instance equivalence instead. This can be TermAutomatonQuery equivalence or equivalence of the underlying Automaton; I don't have a strong opinion on which. Either way, it will actually be more cache-friendly than the current way of computing hashCode by dumping everything to a (potentially huge) Graphviz string...
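          Instance equivalence as proposed above might be sketched like this. `IdentityQuery` is a hypothetical class, not Lucene's TermAutomatonQuery, and this is not necessarily what the eventual commit does; it only shows the general pattern of identity-based equals() and hashCode().

```java
// Hypothetical query class using instance (identity) equivalence.
final class IdentityQuery {
  private final String field;

  IdentityQuery(String field) {
    this.field = field;
  }

  String field() {
    return field;
  }

  @Override
  public boolean equals(Object other) {
    // Two queries are equal only if they are the very same object.
    return this == other;
  }

  @Override
  public int hashCode() {
    // Constant-time and independent of the size of any wrapped automaton.
    return System.identityHashCode(this);
  }
}
```

          With this pattern, a cache keyed by the query finds an entry only when the same instance is reused; two separately built queries for the same field are treated as different keys. The trade-off is fewer cache hits for structurally identical queries in exchange for a cheap, stable hash.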

          mikemccand Michael McCandless added a comment -

          Woops, sorry about this mess. +1 for instance equivalence on either the automaton or the query.

          dweiss Dawid Weiss added a comment -

          Thx, will fix it soon.

          jira-bot ASF subversion and git services added a comment -

          Commit c367f51793e02220dc9f276aaa1b26c6434aa254 in lucene-solr's branch refs/heads/branch_6x from Dawid Weiss
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c367f51 ]

          LUCENE-7295: TermAutomatonQuery.hashCode calculates Automaton.toDot().hash.

          jira-bot ASF subversion and git services added a comment -

          Commit 6e8ca1a094ee8dda61f4e210e310ad26e6decacf in lucene-solr's branch refs/heads/master from Dawid Weiss
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=6e8ca1a ]

          LUCENE-7295: TermAutomatonQuery.hashCode calculates Automaton.toDot().hash.


            People

            • Assignee: dweiss Dawid Weiss
            • Reporter: dweiss Dawid Weiss