  1. Nutch
  2. NUTCH-2206

Provide example scoring.similarity.stopword.file

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.11
    • Fix Version/s: 1.12
    • Component/s: plugin, scoring
    • Labels:
      None

      Description

      The scoring-similarity plugin does not provide an example file for the property scoring.similarity.stopword.file.
      This is a problem for two reasons:

      • A user does not know what the file is meant to look like, and
      • We always check for this file and throw an exception if it is not found, which the user may not notice until much later.

      I suggest a simple fix: include the standard English stop words taken from Lucene's StopAnalyzer. Comments in the file will help people customize the list to whatever they require.
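To illustrate what such an example file might contain and how it could be read, here is a minimal Java sketch. It is not the plugin's actual loader. The file format (one word per line, `#` starting a comment) and the names `StopWordFileSketch`, `parse`, and `load` are assumptions made for illustration. The word list is the default English set as I understand it from Lucene's StopAnalyzer.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; the class and method names are not part of Nutch.
public class StopWordFileSketch {

    // Assumed format: one stop word per line, '#' starts a comment,
    // blank lines are ignored, words are lower-cased.
    static Set<String> parse(BufferedReader reader) throws IOException {
        Set<String> words = new LinkedHashSet<>();
        String line;
        while ((line = reader.readLine()) != null) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash); // drop the comment part
            }
            line = line.trim().toLowerCase(Locale.ROOT);
            if (!line.isEmpty()) {
                words.add(line);
            }
        }
        return words;
    }

    // Fails immediately with a clear message if the file is missing,
    // rather than surfacing the problem much later.
    static Set<String> load(Path file) throws IOException {
        if (!Files.isReadable(file)) {
            throw new FileNotFoundException("Stop word file not found: " + file);
        }
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return parse(reader);
        }
    }

    public static void main(String[] args) throws IOException {
        // Contents an example stop word file might have (assumed layout).
        String sample = String.join("\n",
            "# Standard English stop words, as in Lucene's StopAnalyzer.",
            "# Add or remove words below to customize the list.",
            "a", "an", "and", "are", "as", "at", "be", "but", "by",
            "for", "if", "in", "into", "is", "it", "no", "not", "of",
            "on", "or", "such", "that", "the", "their", "then", "there",
            "these", "they", "this", "to", "was", "will", "with");
        Set<String> stopWords =
            parse(new BufferedReader(new StringReader(sample)));
        System.out.println(stopWords.size() + " stop words loaded");
    }
}
```

Shipping a commented default like this would cover both points in the description: users see the expected layout, and the check for the file succeeds out of the box.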

        Attachments

        1. NUTCH-2206.patch
          3 kB
          Sujen Shah
        2. NUTCH-2206.patch
          2 kB
          Sujen Shah

            People

            • Assignee:
              lewismc Lewis John McGibbney
              Reporter:
              lewismc Lewis John McGibbney
            • Votes:
              0
              Watchers:
              4

              Dates

              • Created:
                Updated:
                Resolved: