Lucene - Core
LUCENE-5480

Hunspell shouldn't merge dictionary entries

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.8, Trunk
    • Component/s: modules/analysis
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      I've been writing lots of little unit tests for this thing, and I'm pretty sure I screwed this up in LUCENE-5468... sorry.

      Otherwise the whole "prefix-suffix dependencies" described in the manpage won't work.

      Either 'words' should be changed from FST<Long> to FST<IntsRef>, or when there are duplicates we should add 'padding' that we just consume (suggester-style). The latter is a little tricky, but I think this is generally uncommon so it would keep the FST smaller.
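
A minimal sketch of why merging duplicate entries is a problem, under stated assumptions: it uses plain `java.util` maps rather than Lucene's FST, and the two-line dictionary (`work/A`, `work/B`), the class name, and the helper methods are all hypothetical, not taken from the patch. It only illustrates that unioning the flags of duplicate words can allow an affix combination that no single dictionary line permits.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class DictionaryEntriesSketch {

    /** Merging variant: duplicate words collapse into one unioned flag set. */
    public static Map<String, Set<Character>> buildMerged(String[][] entries) {
        Map<String, Set<Character>> dict = new LinkedHashMap<>();
        for (String[] e : entries) {
            Set<Character> flags = dict.computeIfAbsent(e[0], k -> new TreeSet<>());
            for (char c : e[1].toCharArray()) {
                flags.add(c);
            }
        }
        return dict;
    }

    /** Non-merging variant: every dictionary line keeps its own flag set. */
    public static Map<String, List<Set<Character>>> buildSeparate(String[][] entries) {
        Map<String, List<Set<Character>>> dict = new LinkedHashMap<>();
        for (String[] e : entries) {
            Set<Character> flags = new TreeSet<>();
            for (char c : e[1].toCharArray()) {
                flags.add(c);
            }
            dict.computeIfAbsent(e[0], k -> new ArrayList<>()).add(flags);
        }
        return dict;
    }

    /** True if the merged flag set for the word contains both flags. */
    public static boolean mergedAllows(Map<String, Set<Character>> dict,
                                       String word, char a, char b) {
        Set<Character> flags = dict.get(word);
        return flags != null && flags.contains(a) && flags.contains(b);
    }

    /** True only if a single entry for the word carries both flags. */
    public static boolean separateAllows(Map<String, List<Set<Character>>> dict,
                                         String word, char a, char b) {
        List<Set<Character>> all =
            dict.getOrDefault(word, Collections.<Set<Character>>emptyList());
        for (Set<Character> flags : all) {
            if (flags.contains(a) && flags.contains(b)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical dictionary: the same word listed twice with different flags.
        String[][] entries = { {"work", "A"}, {"work", "B"} };
        // Merged: A and B appear to belong to one entry, so the combination is accepted.
        System.out.println(mergedAllows(buildMerged(entries), "work", 'A', 'B'));     // true
        // Separate: no single entry has both flags, so the combination is rejected.
        System.out.println(separateAllows(buildSeparate(entries), "work", 'A', 'B')); // false
    }
}
```

Keeping a list per word loosely corresponds to the first option above (several values per key); the "padding" option would instead keep one value per key and disambiguate the duplicate keys.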

      Shouldn't be hard to fix.

        Activity

        Robert Muir created issue -
        ASF subversion and git services added a comment -

        Commit 1572841 from Robert Muir in branch 'dev/trunk'
        [ https://svn.apache.org/r1572841 ]

        LUCENE-5480: add the tests i have so far... (not including this bug yet though)

        ASF subversion and git services added a comment -

        Commit 1572842 from Robert Muir in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1572842 ]

        LUCENE-5480: add the tests i have so far... (not including this bug yet though)

        Robert Muir added a comment -

        Here is my current state. I've unraveled a few bugs with these cool little tests (the examples from the man page). I'll see how far I can get, but I wanted to snapshot here since it's some progress...

        Robert Muir made changes -
        Field Original Value New Value
        Attachment LUCENE-5480.patch [ 12631803 ]
        Robert Muir added a comment -

        I think the current bug is a longstanding one, because prefix stripping and suffix stripping are not intertwined (so continuation classes from prefixes don't apply to suffixes, and so on).

        This causes overstemming today.

        I'd like to fix the current bug(s) here with the uploaded patch and open a followup issue for that... it's progress.

        ASF subversion and git services added a comment -

        Commit 1573048 from Robert Muir in branch 'dev/trunk'
        [ https://svn.apache.org/r1573048 ]

        LUCENE-5480: Hunspell shouldn't merge dictionary entries

        ASF subversion and git services added a comment -

        Commit 1573057 from Robert Muir in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1573057 ]

        LUCENE-5480: Hunspell shouldn't merge dictionary entries

        Robert Muir made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Fix Version/s 4.8 [ 12326269 ]
        Fix Version/s 5.0 [ 12321663 ]
        Resolution Fixed [ 1 ]
        Uwe Schindler added a comment -

        Close issue after release of 4.8.0

        Uwe Schindler made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Transition           Time In Source Status   Execution Times   Last Executer   Last Execution Date
        Open -> Resolved     12h 45m                 1                 Robert Muir     28/Feb/14 19:57
        Resolved -> Closed   58d 3h 28m              1                 Uwe Schindler   28/Apr/14 00:25

          People

          • Assignee:
            Unassigned
            Reporter:
            Robert Muir
          • Votes:
            0
            Watchers:
            2