OpenNLP / OPENNLP-479

Features related to abbreviation dictionary are not properly collected by DefaultSDContextGenerator

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: tools-1.5.3
    • Fix Version/s: tools-1.5.3
    • Component/s: Sentence Detector
    • Labels:
      None

      Description

      The documentation does not make clear whether entries in the abbreviation dictionary should include the EOS (end-of-sentence) character, for example "mr" versus "mr.". In addition, part of the collector code expects dictionary entries to include the EOS character, while other parts do not.
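
      The mismatch described above can be illustrated with a minimal, self-contained sketch. It does not use OpenNLP classes and is not the actual DefaultSDContextGenerator code; the class and method names (AbbreviationMismatchDemo, lookupWithEos, lookupWithoutEos) are hypothetical, and a plain Set<String> stands in for the abbreviation dictionary. It only shows why a dictionary written in one convention is missed by a lookup written for the other.

      ```java
      import java.util.Arrays;
      import java.util.HashSet;
      import java.util.Locale;
      import java.util.Set;

      public class AbbreviationMismatchDemo {

          // Hypothetical lookup that keeps the EOS character on the token ("mr.").
          static boolean lookupWithEos(Set<String> dict, String token) {
              return dict.contains(token.toLowerCase(Locale.ROOT));
          }

          // Hypothetical lookup that strips one trailing '.' before the lookup ("mr").
          static boolean lookupWithoutEos(Set<String> dict, String token) {
              String t = token.toLowerCase(Locale.ROOT);
              if (t.endsWith(".")) {
                  t = t.substring(0, t.length() - 1);
              }
              return dict.contains(t);
          }

          public static void main(String[] args) {
              // Two dictionaries holding the same abbreviation in the two conventions.
              Set<String> entriesWithEos = new HashSet<>(Arrays.asList("mr."));
              Set<String> entriesWithoutEos = new HashSet<>(Arrays.asList("mr"));

              // Each lookup only succeeds against the dictionary written in its own convention.
              System.out.println(lookupWithEos(entriesWithEos, "Mr."));       // true
              System.out.println(lookupWithoutEos(entriesWithEos, "Mr."));    // false
              System.out.println(lookupWithEos(entriesWithoutEos, "Mr."));    // false
              System.out.println(lookupWithoutEos(entriesWithoutEos, "Mr.")); // true
          }
      }
      ```

      If different features are collected through lookups of both kinds, no single dictionary convention satisfies all of them, which is consistent with the symptom in the issue title.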

        Activity

        Transition: Open -> Closed
        Time in source status: 51d 4h 30m
        Execution times: 1
        Last executer: William Colen
        Last execution date: 08/May/12 20:24
        adithya renduhcintala added a comment -

        Hi, I am trying to add short forms and abbreviations to my sentence detector (using the Java API), but my sentence detector still does not recognize abbreviations and splits sentences where it should not.

        Is there a code snippet that shows how to use the abbreviation dictionary when training a sentence detector?

        William Colen made changes -
        Field        Original Value   New Value
        Status       Open [ 1 ]       Closed [ 6 ]
        Resolution                    Fixed [ 1 ]
        William Colen added a comment -

        Fixed.

        William Colen added a comment -

        I changed the DefaultSDContextGenerator on the assumption that the correct form for abbreviations includes the EOS character, as in "mr.". Please review.
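
        For existing abbreviation lists that mix both conventions, one option is to normalize every entry to the "mr." form before building the dictionary. The following is only a sketch under that assumption: AbbreviationNormalizer, withEos, and normalize are hypothetical names, no OpenNLP classes are used, and the EOS character is passed in explicitly rather than read from any model configuration.

        ```java
        import java.util.Collection;
        import java.util.LinkedHashSet;
        import java.util.Set;

        public class AbbreviationNormalizer {

            // Returns the trimmed entry, appending the EOS character if it is not already the last character.
            static String withEos(String entry, char eos) {
                String t = entry.trim();
                if (t.isEmpty() || t.charAt(t.length() - 1) == eos) {
                    return t;
                }
                return t + eos;
            }

            // Normalizes all entries to the "with EOS" form, dropping blanks and duplicates
            // while keeping the original order.
            static Set<String> normalize(Collection<String> entries, char eos) {
                Set<String> result = new LinkedHashSet<>();
                for (String entry : entries) {
                    String normalized = withEos(entry, eos);
                    if (!normalized.isEmpty()) {
                        result.add(normalized);
                    }
                }
                return result;
            }

            public static void main(String[] args) {
                // "mr" and "mr." collapse into a single entry after normalization.
                System.out.println(normalize(java.util.Arrays.asList("mr", "dr.", "mr.", " "), '.'));
            }
        }
        ```

        The normalized strings could then be loaded into whatever dictionary structure the sentence detector is trained with.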

        William Colen created issue -

          People

          • Assignee:
            William Colen
            Reporter:
            William Colen
          • Votes:
            0
            Watchers:
            1

            Dates

            • Created:
              Updated:
              Resolved: