  1. OpenNLP
  2. OPENNLP-1214

use hash to avoid linear search in DefaultEndOfSentenceScanner

Details

    • Type: Improvement
    • Status: Reopened
    • Priority: Minor
    • Resolution: Unresolved
    • 1.9.0
    • None
    • None
    • None

    Description

      When DefaultEndOfSentenceScanner scans a sentence, it uses a linear search to check whether each character in the sentence is one of the end-of-sentence (EOS) characters. I think we should keep eosCharacters in a HashSet instead of a char[].
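
      A minimal sketch of the proposed lookup, not the actual OpenNLP code: the class name HashedEosScannerSketch and its constructor are hypothetical, and only the idea of keeping the EOS characters in a HashSet comes from this issue. Each character is tested with a hash lookup rather than a pass over a char[].

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; it does not mirror the real
// DefaultEndOfSentenceScanner beyond the idea described in this issue.
class HashedEosScannerSketch {

    // EOS characters stored in a hash set instead of a char[].
    private final Set<Character> eosCharacters;

    HashedEosScannerSketch(char[] eosChars) {
        Set<Character> set = new HashSet<>();
        for (char c : eosChars) {
            set.add(c);
        }
        this.eosCharacters = set;
    }

    // Returns the index of every character in the text that is an EOS character.
    List<Integer> getPositions(CharSequence text) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            // Hash lookup (expected constant time) replaces the inner linear search.
            if (eosCharacters.contains(text.charAt(i))) {
                positions.add(i);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        HashedEosScannerSketch scanner =
            new HashedEosScannerSketch(new char[] {'.', '!', '?'});
        System.out.println(scanner.getPositions("Hi. Really? Yes!")); // prints [2, 10, 15]
    }
}
```

      With only a handful of EOS characters, the gain over a linear scan may be small, since each lookup boxes the char into a Character; the benefit should grow as the set of EOS characters gets larger.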

      To go with this change, I'd like to deprecate getEndOfSentenceCharacters(), because it returns char[] and nothing in OpenNLP currently calls it, and add an equivalent method that returns the EOS characters as a Set<Character>. A Set cannot preserve the order of the EOS characters, but I don't think that will be a problem.
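
      The accessor change could look roughly like the sketch below. The class name EosAccessorSketch and the new method name getEosCharacterSet() are assumptions made for illustration; the issue only asks for some method returning Set<Character>. The deprecated getter is kept for compatibility and rebuilds a char[] whose order is unspecified.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration class; names other than
// getEndOfSentenceCharacters() are assumptions, not OpenNLP API.
class EosAccessorSketch {

    private final Set<Character> eosCharacters;

    EosAccessorSketch(char[] eosChars) {
        Set<Character> set = new HashSet<>();
        for (char c : eosChars) {
            set.add(c); // duplicates collapse, original order is not kept
        }
        this.eosCharacters = Collections.unmodifiableSet(set);
    }

    /**
     * @deprecated returns a fresh array in unspecified order;
     *             prefer the Set-returning accessor.
     */
    @Deprecated
    char[] getEndOfSentenceCharacters() {
        char[] out = new char[eosCharacters.size()];
        int i = 0;
        for (char c : eosCharacters) {
            out[i++] = c;
        }
        return out;
    }

    // Hypothetical replacement accessor: a read-only view of the EOS characters.
    Set<Character> getEosCharacterSet() {
        return eosCharacters;
    }
}
```

      Returning an unmodifiable set is one possible design choice here; it keeps callers from changing the scanner's EOS characters after construction.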

          People

            Assignee: Koji Sekiguchi
            Reporter: Koji Sekiguchi
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated: