  1. Lucene - Core
  2. LUCENE-7420

SpanishLightStemmer stemming errors


Details

    • Bug
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • None
    • None
    • modules/analysis
    • New, Patch Available

    Description

      SpanishLightStemmer only applies stemming if the original word is 5 characters or longer. It tries to remove gender and number markers by stripping the final vowel and 's' if they exist.

      So, perro, perra, perros and perras (dog, bitch, dogs and bitches) all become 'perr'.
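      The problem arises with shorter words. Gatos and Gatas (cats / female cats) both become 'gat', but the singular forms (gato, gata) are below the 5 character threshold, so they are not stemmed. As a result, the singular and plural forms of the same word end up as different terms and no longer match each other.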
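
The mismatch can be reproduced with a small standalone sketch. This is a simplified model built only from the description above, not the Lucene source: the class name `LightStemSketch`, the `MIN_LENGTH` constant, the restriction to the vowels a/e/o, and the lowercasing step are all assumptions made for illustration.

```java
import java.util.Locale;

// Simplified model of the behavior described in this issue.
// NOT the actual SpanishLightStemmer implementation; names are hypothetical.
final class LightStemSketch {
    // Assumed threshold from the description: shorter words are left untouched.
    static final int MIN_LENGTH = 5;

    // Assumption: only the gender/number vowels a, e, o are stripped.
    static boolean isStrippableVowel(char c) {
        return c == 'a' || c == 'e' || c == 'o';
    }

    static String stem(String word) {
        // Lowercasing is done here only to keep the sketch self-contained.
        String w = word.toLowerCase(Locale.ROOT);
        int len = w.length();
        if (len < MIN_LENGTH) {
            return w; // below the threshold: returned unchanged
        }
        char last = w.charAt(len - 1);
        if (last == 's' && isStrippableVowel(w.charAt(len - 2))) {
            return w.substring(0, len - 2); // plural: drop vowel + 's'
        }
        if (isStrippableVowel(last)) {
            return w.substring(0, len - 1); // singular: drop final vowel
        }
        return w;
    }

    public static void main(String[] args) {
        String[] words = {"perro", "perra", "perros", "perras",
                          "gato", "gata", "gatos", "gatas"};
        for (String w : words) {
            System.out.println(w + " -> " + stem(w));
        }
    }
}
```

Under these assumptions, all four forms of "perro" reduce to "perr", while "gato" and "gata" come back unchanged and "gatos" and "gatas" reduce to "gat", which is the inconsistency reported here.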

          People

            Assignee: Unassigned
            Reporter: J Pardos (jpardos)
