  1. Lucene - Core
  2. LUCENE-9567

JapanesePartOfSpeechStopFilterFactory should load built-in stop tags by default

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 8.6
    • Fix Version/s: None
    • Component/s: modules/analysis
    • Labels: None
    • Lucene Fields: New

    Description

      If JapanesePartOfSpeechStopFilterFactory is given empty args, it does nothing. It doesn't load any stop tags, and just passes along the TokenStream passed to create().

      As a default behavior, this is trappy, since a user may add the filter without explicitly adding any arguments and assume that it would load a "default" stop set. Or they may assume that if an explicit argument is required then an exception will be thrown. Regardless, "doing nothing" is almost certainly not what the user intended.

      I'm going to attach a patch to load the default stop tags (using JapaneseAnalyzer.getDefaultStopTags()) if no args are specified, which probably makes sense in 9.0 (as it's consistent with e.g. KoreanPartOfSpeechStopFilterFactory). If we want to apply a fix to 8.x, maybe throw an exception to let the user know that the FilterFactory probably isn't doing what they think it's doing?
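
The three behaviors discussed above (the current pass-through, the proposed default-loading for 9.0, and the fail-fast idea for 8.x) can be compared in a small stand-alone sketch. This is not the Lucene implementation and uses no Lucene classes: `StopTagDefaultsSketch`, `MissingArgsPolicy`, `resolveStopTags`, `filter`, and the `TAG_*` names are all hypothetical, the placeholder `DEFAULT_STOP_TAGS` stands in for whatever `JapaneseAnalyzer.getDefaultStopTags()` returns, and the `tags` argument is treated here as an inline comma-separated list purely to keep the example self-contained.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class StopTagDefaultsSketch {

    /** How the (hypothetical) factory reacts when no stop-tag argument is given. */
    enum MissingArgsPolicy { PASS_THROUGH, LOAD_DEFAULTS, FAIL }

    // Placeholder tags; an assumption standing in for JapaneseAnalyzer.getDefaultStopTags().
    static final Set<String> DEFAULT_STOP_TAGS =
        Collections.unmodifiableSet(new HashSet<>(Arrays.asList("TAG_PARTICLE", "TAG_SYMBOL")));

    /** Resolves the stop tags to use, applying the policy only when "tags" is absent. */
    static Set<String> resolveStopTags(Map<String, String> args, MissingArgsPolicy policy) {
        String tags = args.get("tags");
        if (tags != null) {
            // Explicit configuration always wins over any default.
            return new HashSet<>(Arrays.asList(tags.split(",")));
        }
        switch (policy) {
            case LOAD_DEFAULTS:
                return DEFAULT_STOP_TAGS;
            case FAIL:
                throw new IllegalArgumentException(
                    "No stop tags configured; the filter would do nothing");
            default:
                // PASS_THROUGH: an empty set removes no tokens.
                return Collections.emptySet();
        }
    }

    /** Keeps the surface form of every (surface, tag) token whose tag is not a stop tag. */
    static List<String> filter(List<Map.Entry<String, String>> tokens, Set<String> stopTags) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, String> token : tokens) {
            if (!stopTags.contains(token.getValue())) {
                kept.add(token.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] unused) {
        Map<String, String> noArgs = new HashMap<>();
        List<Map.Entry<String, String>> tokens = new ArrayList<>();
        tokens.add(new SimpleEntry<>("word", "TAG_NOUN"));
        tokens.add(new SimpleEntry<>("wa", "TAG_PARTICLE"));

        // Current behavior: nothing is filtered when no args are given.
        System.out.println(filter(tokens, resolveStopTags(noArgs, MissingArgsPolicy.PASS_THROUGH)));
        // Proposed behavior: the default stop tags apply.
        System.out.println(filter(tokens, resolveStopTags(noArgs, MissingArgsPolicy.LOAD_DEFAULTS)));
    }
}
```

With empty args, the pass-through policy silently keeps every token, which is the trap described above; the other two policies either filter with the defaults or surface the misconfiguration immediately.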

            People

              Assignee: Unassigned
              Reporter: Michael Froh (msfroh)
              Votes: 0
              Watchers: 2

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h