  1. OpenNLP
  2. OPENNLP-479

Features related to abbreviation dictionary are not properly collected by DefaultSDContextGenerator

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: tools-1.5.3
    • Fix Version/s: tools-1.5.3
    • Component/s: Sentence Detector
    • Labels:
      None

      Description

      The documentation is not clear about whether entries in the abbreviation dictionary should include the EOS character, for example "mr" versus "mr.". In addition, part of the collector code expects dictionary entries to include the EOS character, while other parts do not.

        Activity

        arenduch adithya renduhcintala added a comment -

        Hi, I am trying to add short forms and abbreviations to my sentence detector (using the Java API), but it still does not detect abbreviations and splits sentences where it should not.

        Is there a code snippet showing how to use the abbreviation dictionary when training a sentence detector?

        colen William Colen added a comment -

        Fixed.

        colen William Colen added a comment -

        I changed DefaultSDContextGenerator on the assumption that abbreviations should be stored with the EOS character, in the form "mr.". Please review.
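
The convention described above (dictionary entries keep the trailing EOS character) can be illustrated with a minimal, library-free sketch. It does not reproduce the actual DefaultSDContextGenerator code: the class name, the `ABBREVIATIONS` set, and the `endsWithAbbreviation` helper are hypothetical, and the lower-casing of tokens before lookup is an assumption made for this illustration only.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch only; not the OpenNLP implementation.
final class AbbreviationLookupSketch {

    // Hypothetical stand-in for the abbreviation dictionary.
    // Entries are stored WITH the EOS character, e.g. "mr." rather than "mr".
    static final Set<String> ABBREVIATIONS = new HashSet<>();
    static {
        ABBREVIATIONS.add("mr.");
        ABBREVIATIONS.add("dr.");
    }

    /**
     * Checks whether the whitespace-delimited token that ends at the
     * candidate EOS character (position eosIndex, inclusive) is a known
     * abbreviation. The EOS character is kept as part of the token so the
     * lookup matches entries of the form "mr.".
     */
    static boolean endsWithAbbreviation(String text, int eosIndex) {
        int start = eosIndex;
        // Walk back to the beginning of the token.
        while (start > 0 && !Character.isWhitespace(text.charAt(start - 1))) {
            start--;
        }
        // Assumption: lookup is case-insensitive, so normalize to lower case.
        String token = text.substring(start, eosIndex + 1).toLowerCase(Locale.ROOT);
        return ABBREVIATIONS.contains(token);
    }

    public static void main(String[] args) {
        String text = "Mr. Smith arrived.";
        // Period after "Mr" (index 2): token "mr." is in the dictionary.
        System.out.println(endsWithAbbreviation(text, 2));
        // Final period: token "arrived." is not in the dictionary.
        System.out.println(endsWithAbbreviation(text, text.length() - 1));
    }
}
```

If the dictionary instead held "mr" without the period, the same lookup would miss it, which is the kind of mismatch this issue describes.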


          People

          • Assignee:
            colen William Colen
            Reporter:
            colen William Colen
