  1. OpenNLP
  2. OPENNLP-203

UIMA Sentence Detector Trainer builds models that do not split sentences correctly

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: tools-1.5.1-incubating
    • Fix Version/s: tools-1.5.2-incubating
    • None

    Description

      Models trained with the UIMA component produce wrong begin/end offsets, even though they do split the text into sentences.
      I observed that each sentence begins with the final punctuation character of the previous sentence as its first token, while the
      previous sentence does not include that character as its last token.
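
      The symptom described above can be checked mechanically. The following is a minimal sketch, not the fix applied in OpenNLP: it does not call the OpenNLP or UIMA APIs, and the class name, method name, sample text, and offsets are all hypothetical. It assumes sentence annotations are given as [begin, end) character offsets and flags any sentence whose covered text starts with end-of-sentence punctuation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustrating the reported symptom; not part of OpenNLP.
class SentenceOffsetCheck {

    /**
     * Returns the indices of spans whose covered text starts with
     * an end-of-sentence punctuation character ('.', '!' or '?').
     * Each span is a {begin, end} pair with an exclusive end offset.
     */
    static List<Integer> spansStartingWithPunctuation(String text, int[][] spans) {
        List<Integer> suspicious = new ArrayList<>();
        for (int i = 0; i < spans.length; i++) {
            int begin = spans[i][0];
            int end = spans[i][1];
            if (begin < end && ".!?".indexOf(text.charAt(begin)) >= 0) {
                suspicious.add(i);
            }
        }
        return suspicious;
    }

    public static void main(String[] args) {
        String text = "First one. Second one.";

        // Offsets of the kind described in the report (assumed values):
        // the first sentence loses its final '.', the second one starts with it.
        int[][] shifted = { {0, 9}, {9, 22} };

        // Offsets one would expect from a correctly trained model (assumed values).
        int[][] expected = { {0, 10}, {11, 22} };

        System.out.println("shifted:  " + spansStartingWithPunctuation(text, shifted));
        System.out.println("expected: " + spansStartingWithPunctuation(text, expected));
    }
}
```

      Running such a check over the sentence annotations produced by a trained model gives a quick way to confirm whether the offsets are shifted as described.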

          People

            Assignee: Jörn Kottmann (joern)
            Reporter: Nicolas Hernandez (nicolas.hernandez)