Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.9, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Previously this used three hashes (prefixes, words, suffixes): it tried to split each word in various ways and did a lookup for each split. This was changed to use an FST, but the algorithm wasn't adjusted to use it properly (e.g. a single pass that terminates when it reaches a "dead end").

      This makes for slower indexing when using this stemmer.
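
To illustrate the "single pass, terminate at a dead end" idea, here is a rough sketch. It is not the patch and does not use Lucene's FST API: a plain in-memory trie stands in for the FST, and the class and method names (`AffixTrieSketch`, `add`, `matchingPrefixLengths`) are made up for this example. The point is only that one left-to-right walk finds every stored prefix of a word, and the walk stops as soon as no transition exists, instead of hashing every possible split.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy trie standing in for the FST; hypothetical, not Lucene's FST API. */
public class AffixTrieSketch {
    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        boolean terminal; // true if a stored affix ends at this node
    }

    private final Node root = new Node();

    /** Stores one affix (for example a prefix from an affix file). */
    public void add(String affix) {
        Node node = root;
        for (int i = 0; i < affix.length(); i++) {
            node = node.next.computeIfAbsent(affix.charAt(i), c -> new Node());
        }
        node.terminal = true;
    }

    /** Returns the lengths of all stored affixes that are prefixes of word, found in one pass. */
    public List<Integer> matchingPrefixLengths(String word) {
        List<Integer> lengths = new ArrayList<>();
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.next.get(word.charAt(i));
            if (node == null) {
                break; // dead end: no longer prefix can match, so stop here
            }
            if (node.terminal) {
                lengths.add(i + 1); // an affix of length i + 1 matches
            }
        }
        return lengths;
    }
}
```

With the hash-based approach, each candidate split of a word costs a separate lookup even when the first character already rules out any match. In the walk above, the cost is bounded by the length of the longest path that actually exists in the structure.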

        Activity

        Robert Muir added a comment -

        Here's a patch.

        Reusing my previous benchmark (with Polish, see the last comment on SOLR-3245), indexing speed increases from 2400 docs/second to 2900 docs/second. So it's not much of a relative increase in speed (due to some properties of this dictionary), but I still think it's worth it. And of course it's much better than the 71 docs/second in Lucene 4.7.

        Michael McCandless added a comment -

        +1, looks good!

        ASF subversion and git services added a comment -

        Commit 1587162 from rmuir@apache.org in branch 'dev/trunk'
        [ https://svn.apache.org/r1587162 ]

        LUCENE-5603: fix hunspell to use FST efficiently

        ASF subversion and git services added a comment -

        Commit 1587163 from rmuir@apache.org in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1587163 ]

        LUCENE-5603: fix hunspell to use FST efficiently


          People

          • Assignee:
            Unassigned
            Reporter:
            Robert Muir
          • Votes:
            0
            Watchers:
            2
