  1. OpenNLP
  2. OPENNLP-660

Include list of stop words for various languages


Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • tools-1.5.3
    • None
    • Parser, Stemmer
    • all

    Description

      This feature request is for the inclusion of stop word lists for various languages. These lists can be used to reduce the noise caused by frequent but irrelevant words, e.g. when tokenizing texts. For a first iteration, each list could be a simple list of single words. A later iteration could also include multi-stopwords that apply to n-grams (i.e. a word in the list will serve to "stop" a multi-word n-gram).
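
      As an illustration of the request above, here is a minimal sketch in plain Java. It is not part of OpenNLP and uses no OpenNLP API; the class name StopWordFilter and its methods are hypothetical. It assumes the stop word list is a lower-cased set of single words, and it reads the n-gram case as "drop any n-gram that contains a stop word", which is only one possible interpretation of the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not an OpenNLP class: filters tokens and n-grams
// against a caller-supplied stop word list.
public class StopWordFilter {

    // Stop words, assumed to be stored in lower case.
    private final Set<String> stopWords;

    public StopWordFilter(Set<String> stopWords) {
        this.stopWords = stopWords;
    }

    // Case-insensitive lookup of a single token.
    public boolean isStopWord(String token) {
        return stopWords.contains(token.toLowerCase(Locale.ROOT));
    }

    // First iteration: remove single-word stop words from a token sequence.
    public List<String> filterTokens(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!isStopWord(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    // N-gram case (assumed interpretation): build all n-grams of length n
    // and keep only those that contain no stop word.
    public List<String> filterNGrams(List<String> tokens, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        List<String> kept = new ArrayList<>();
        for (int start = 0; start + n <= tokens.size(); start++) {
            List<String> gram = tokens.subList(start, start + n);
            if (gram.stream().noneMatch(this::isStopWord)) {
                kept.add(String.join(" ", gram));
            }
        }
        return kept;
    }
}
```

      With the stop words "the" and "over", the tokens of "The quick brown fox jumps over the lazy dog" would be reduced to the six content words, and only the bigrams free of stop words would survive. A real implementation would presumably load one such list per language from a resource file.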

      Attachments

        Activity

          People

            Assignee: Unassigned
            Reporter: Martin Wunderlich (mwunderlich)
            Votes: 1
            Watchers: 3

            Dates

              Created:
              Updated:

              Time Tracking

                Estimated: 0.05h
                Remaining: 0.05h
                Logged: Not Specified