  1. Nutch
  2. NUTCH-2206

Provide example scoring.similarity.stopword.file

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.11
    • Fix Version/s: 1.12
    • Component/s: plugin, scoring
    • Labels: None

    Description

      The scoring-similarity plugin does not provide an example file for the property scoring.similarity.stopword.file.
      This is a problem for two reasons:

      • A user does not know what the file is meant to look like, and
      • We always check for this file and throw an exception if it is not found, which the user may not notice until much later.

      I suggest a simple fix: include the standard English stop words taken from Lucene's StopAnalyzer. The comments will help people customize the list to whatever they require.
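
      To illustrate the failure mode described above, here is a minimal sketch of a loader that reads a stop word file and fails fast when the file is missing. It is not the plugin's actual code: the class name StopWordLoader, the one-word-per-line layout, and the '#' comment prefix are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Nutch: shows one way a stop word file could be read.
public class StopWordLoader {

  /**
   * Reads one stop word per line. Blank lines and lines starting with '#'
   * (an assumed comment syntax) are skipped. Words are lower-cased.
   */
  public static Set<String> load(Path file) throws IOException {
    // Fail immediately with a clear message instead of surfacing the problem later.
    if (!Files.isReadable(file)) {
      throw new NoSuchFileException(file.toString(), null,
          "stop word file not found or not readable");
    }
    Set<String> words = new LinkedHashSet<>();
    for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
      String trimmed = line.trim();
      if (trimmed.isEmpty() || trimmed.startsWith("#")) {
        continue; // skip blank lines and comments
      }
      words.add(trimmed.toLowerCase(Locale.ROOT));
    }
    return words;
  }

  public static void main(String[] args) throws IOException {
    // Write a tiny illustrative file; a real list would be much longer.
    Path sample = Files.createTempFile("stopwords", ".txt");
    Files.write(sample,
        Arrays.asList("# English stop words, one per line", "a", "an", "", "The"),
        StandardCharsets.UTF_8);
    System.out.println(load(sample)); // prints [a, an, the]
    Files.delete(sample);
  }
}
```

      Shipping a commented example file alongside such a loader means the existence check passes out of the box, and users have a template to edit.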

      Attachments

        1. NUTCH-2206.patch
          3 kB
          Sujen Shah
        2. NUTCH-2206.patch
          2 kB
          Sujen Shah

          People

            Assignee: Lewis John McGibbney (lewismc)
            Reporter: Lewis John McGibbney (lewismc)
            Votes: 0
            Watchers: 4
