  1. OpenNLP
  2. OPENNLP-549

Inconsistent handling of lower-/upper-case POS tags in the JWNLDictionary.getLemmas method

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: tools-1.5.2-incubating
    • Fix Version/s: tools-1.5.3
    • Component/s: Coref
    • Labels: None

      Description

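      The JWNLDictionary.getLemmas method contains code that maps string-valued POS tags to net.didion.jwnl.data.POS instances, and this code clearly has a bug. For example, it checks for tags starting with "N" in both the POS.NOUN and the POS.VERB branches, so uppercase verb tags such as "VB" never reach POS.VERB. The adjective branch has a similar inconsistency: it accepts "J" or "a" rather than "J" or "j". Please see the patch below (the patch assumes Penn Treebank tags).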

      Index: src/main/java/opennlp/tools/coref/mention/JWNLDictionary.java
      ===================================================================
      --- src/main/java/opennlp/tools/coref/mention/JWNLDictionary.java (revision 1410284)
      +++ src/main/java/opennlp/tools/coref/mention/JWNLDictionary.java (working copy)
      @@ -84,10 +84,10 @@
           if (tag.startsWith("N") || tag.startsWith("n")) {
             pos = POS.NOUN;
           }
      -    else if (tag.startsWith("N") || tag.startsWith("v")) {
      +    else if (tag.startsWith("V") || tag.startsWith("v")) {
             pos = POS.VERB;
           }
      -    else if (tag.startsWith("J") || tag.startsWith("a")) {
      +    else if (tag.startsWith("J") || tag.startsWith("j")) {
             pos = POS.ADJECTIVE;
           }
           else if (tag.startsWith("R") || tag.startsWith("r")) {
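      For illustration only, the mapping that the patch aims for can be sketched as a standalone helper that compares the first character of the tag case-insensitively, which makes it harder for the upper- and lower-case checks to drift apart. This is not the code in JWNLDictionary: the class name PosTagMapping, the method toPos, and the local Pos enum are hypothetical stand-ins (the enum replaces net.didion.jwnl.data.POS so that the example compiles without JWNL). Mapping "R" tags to ADVERB and returning null for unrecognized tags are assumptions, since the patch excerpt stops before showing either case.

```java
// Hypothetical, self-contained sketch; not the actual OpenNLP implementation.
final class PosTagMapping {

    // Local stand-in for net.didion.jwnl.data.POS (assumption for this example).
    enum Pos { NOUN, VERB, ADJECTIVE, ADVERB }

    // Maps a Penn Treebank style tag to a coarse part of speech by looking at
    // its first letter, ignoring case.
    static Pos toPos(String tag) {
        if (tag == null || tag.isEmpty()) {
            return null; // assumption: the excerpt does not show a fallback
        }
        switch (Character.toUpperCase(tag.charAt(0))) {
            case 'N': return Pos.NOUN;       // NN, NNS, NNP, ...
            case 'V': return Pos.VERB;       // VB, VBD, VBZ, ...
            case 'J': return Pos.ADJECTIVE;  // JJ, JJR, JJS
            case 'R': return Pos.ADVERB;     // RB, RBR, RBS (assumed branch body)
            default:  return null;           // assumption: unknown tags are unmapped
        }
    }

    public static void main(String[] args) {
        System.out.println(toPos("VBD")); // prints VERB
        System.out.println(toPos("jj"));  // prints ADJECTIVE
    }
}
```

      With the pre-patch conditions, a tag such as "VBD" would match neither the noun branch nor the verb branch, which is the inconsistency the issue title refers to.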

            People

            • Assignee: Aliaksandr Autayeu (autayeu)
            • Reporter: Gleb Alexeyev (alexeevg)