OpenNLP / OPENNLP-479

Features related to abbreviation dictionary are not properly collected by DefaultSDContextGenerator

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: tools-1.5.3
    • Fix Version/s: tools-1.5.3
    • Component/s: Sentence Detector
    • Labels:
      None

      Description

      The documentation is not clear about whether entries in the abbreviation dictionary should include the EOS character, for example "mr" or "mr.". In addition, part of the feature collector code expects dictionary entries to include the EOS character, while other parts expect entries without it.
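
      To illustrate the mismatch, here is a minimal sketch of a lookup that accepts both entry forms. It is not the OpenNLP implementation: the class AbbreviationLookup, its methods, and the choice of '.' as the only EOS character are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/**
 * Hypothetical helper, not part of OpenNLP. It stores every abbreviation
 * in one canonical form (lower case, ending with the EOS character) so that
 * "mr" and "mr." refer to the same entry.
 */
final class AbbreviationLookup {
    // Assumption: '.' is the only end-of-sentence character of interest.
    private static final char EOS = '.';

    private final Set<String> entries = new HashSet<>();

    AbbreviationLookup(String... rawEntries) {
        for (String entry : rawEntries) {
            entries.add(normalize(entry));
        }
    }

    /** Trims and lower-cases the text, then appends the EOS character if it is missing. */
    static String normalize(String text) {
        String s = text.trim().toLowerCase(Locale.ROOT);
        if (s.isEmpty() || s.charAt(s.length() - 1) == EOS) {
            return s;
        }
        return s + EOS;
    }

    /** Returns true if the token matches an entry, with or without the trailing EOS character. */
    boolean contains(String token) {
        return entries.contains(normalize(token));
    }
}
```

      With a dictionary built as new AbbreviationLookup("mr", "Dr."), both contains("Mr.") and contains("dr") return true. A collector that compares raw strings would match only one of the two forms, which is the inconsistency reported here.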

        Activity

        William Colen added a comment -

        I changed the DefaultSDContextGenerator on the assumption that the correct form for abbreviations includes the EOS character, as in "mr.". Please review.

        William Colen added a comment -

        Fixed.

        adithya renduhcintala added a comment -

        Hi, I am trying to add short forms and abbreviations to my sentence detector (using the Java API), but my sentence detector still does not recognize abbreviations and splits sentences where it should not.

        Is there a code snippet that shows how to use the abbreviation dictionary when training a sentence detector?


          People

          • Assignee:
            William Colen
            Reporter:
            William Colen
          • Votes:
            0
            Watchers:
            1
