Lucene - Core / LUCENE-9410

German/French stemmers fail for common forms maux, gegrüßt, grüßend, schlummert

Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 8.5
    • Fix Version/s: None
    • Component/s: modules/analysis
    • Environment: Elasticsearch 7.7.1 running on cloud.elastic.co
    • Lucene Fields: New

    Description

      I'm using Lucene via Elasticsearch 7.7.1. The German and French stemmers (whether via the Snowball analyzer or the "light" or "heavy" stemming analyzers) fail to reduce some common inflected forms to the same stem as their base form:

      French:

      • "maux" (plural) should match "mal" (singular) but instead "maux" is unchanged

      German:

      • "schlummert" should match "schlummern" (infinitive) but instead is unchanged
      • "grüßend" should match "grüßen" (infinitive) but instead yields "grussend"
      • "gegrüßt" should match "grüßen" (infinitive) but instead yields "gegrusst"

      The Elasticsearch folks said I should file a bug with Lucene.
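
      The cases listed above could be checked against a cluster with Elasticsearch's `_analyze` API. The following is a minimal sketch that only builds the request bodies and makes no network call. The `stemmer` filter language values (`french`, `light_french`, `minimal_french`, `german`, `light_german`, `minimal_german`) are an assumption about which variants are meant; the report does not name the exact filters, and the helper names are hypothetical.

```python
import json

# Cases as reported in this issue: (language, inflected form, base form).
REPORTED_CASES = [
    ("french", "maux", "mal"),
    ("german", "schlummert", "schlummern"),
    ("german", "grüßend", "grüßen"),
    ("german", "gegrüßt", "grüßen"),
]

# Assumption: these Elasticsearch `stemmer` filter language values stand in
# for the variants the report refers to; the issue does not name them.
STEMMER_VARIANTS = {
    "french": ["french", "light_french", "minimal_french"],
    "german": ["german", "light_german", "minimal_german"],
}


def analyze_body(stemmer_language, text):
    """Build a request body for Elasticsearch's `_analyze` endpoint."""
    return {
        "tokenizer": "standard",
        "filter": ["lowercase", {"type": "stemmer", "language": stemmer_language}],
        "text": text,
    }


def reproduction_requests():
    """Yield (stemmer language, JSON body) pairs, one per case and variant."""
    for language, inflected, base in REPORTED_CASES:
        for variant in STEMMER_VARIANTS[language]:
            # Both forms go into one request so their stems can be compared
            # side by side in the returned token list.
            body = analyze_body(variant, f"{inflected} {base}")
            yield variant, json.dumps(body, ensure_ascii=False)


if __name__ == "__main__":
    for variant, body in reproduction_requests():
        print(variant, body)
```

      Each printed body can be sent as a POST to `/_analyze`. If the two tokens in the response differ, the inflected form and the base form would not match at search time with that stemmer.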

            People

              Assignee: Unassigned
              Reporter: Ben Kazez (bkazez)
              Votes: 0
              Watchers: 2

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 10m