OpenNLP / OPENNLP-508

Add an option to create or expand a TagDictionary with training data

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: tools-1.5.3
    • Fix Version/s: tools-1.5.3
    • Component/s: POS Tagger
    • Labels:
      None

      Description

      It would be useful if we could expand or create the TagDictionary while training a POS Tagger model.

      I propose that we add a new command line argument, -tagDictCutoff, that would trigger the creation or expansion of the dictionary. The cutoff would be the minimum number of times a word/tag pair must occur in the training data before it is added to the dictionary.
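
      The cutoff rule described above can be sketched in plain Java. This is only an illustration of the counting idea, not the OpenNLP implementation: the class name TagDictCutoffSketch, the method buildTagDictionary, and the representation of training tokens as {word, tag} arrays are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: none of these names come from the OpenNLP API.
public class TagDictCutoffSketch {

    // Each training token is a {word, tag} array. Returns a map from word to
    // the set of tags that co-occurred with it at least `cutoff` times.
    static Map<String, Set<String>> buildTagDictionary(List<String[]> tokens, int cutoff) {
        // Count every word/tag pair in the training data.
        Map<String, Map<String, Integer>> counts = new HashMap<>();
        for (String[] t : tokens) {
            counts.computeIfAbsent(t[0], k -> new HashMap<>()).merge(t[1], 1, Integer::sum);
        }
        // Keep only the pairs whose count reaches the cutoff.
        Map<String, Set<String>> dict = new TreeMap<>();
        for (Map.Entry<String, Map<String, Integer>> word : counts.entrySet()) {
            for (Map.Entry<String, Integer> tag : word.getValue().entrySet()) {
                if (tag.getValue() >= cutoff) {
                    dict.computeIfAbsent(word.getKey(), k -> new TreeSet<>()).add(tag.getKey());
                }
            }
        }
        return dict;
    }

    public static void main(String[] args) {
        List<String[]> tokens = Arrays.asList(
            new String[] {"run", "VB"},
            new String[] {"run", "VB"},
            new String[] {"run", "NN"},
            new String[] {"the", "DT"},
            new String[] {"the", "DT"});
        // With a cutoff of 2, the single run/NN occurrence is dropped.
        System.out.println(buildTagDictionary(tokens, 2)); // prints {run=[VB], the=[DT]}
    }
}
```

      Expanding an existing dictionary, rather than creating one, would presumably start from the existing entries and add the pairs that pass the cutoff; that variant is not shown here.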

      Further information can be found in this mailing list thread: http://mail-archives.apache.org/mod_mbox/opennlp-dev/201205.mbox/%3CCA%2BiWThJNQzLSc3NmDLbEzaORDWnFgbk_id3SJjuELVRSoMTJzQ%40mail.gmail.com%3E

        Activity

        William Colen created the issue.
        William Colen added a comment:

        Now we can optionally create the TagDictionary using the training data. When performing cross-validation, only the training data is added to the dictionary.
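
        The cross-validation behaviour mentioned in this comment can be illustrated with a small, self-contained Java sketch. It is not the OpenNLP code: the class CrossValidationDictSketch, its methods, the round-robin fold assignment, and the "word/tag" string keys are all assumptions chosen for brevity. The point is only that the dictionary for each fold is built from that fold's training partition, so held-out sentences never contribute entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: none of these names come from the OpenNLP API.
public class CrossValidationDictSketch {

    // Returns the "word/tag" pairs that occur at least `cutoff` times.
    static Set<String> frequentPairs(List<String[]> tokens, int cutoff) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] t : tokens) {
            counts.merge(t[0] + "/" + t[1], 1, Integer::sum);
        }
        Set<String> kept = new TreeSet<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= cutoff) {
                kept.add(e.getKey());
            }
        }
        return kept;
    }

    // Training partition for one fold: all sentences except those whose
    // index is assigned (round-robin) to the held-out fold.
    static List<String[]> trainingPartition(List<List<String[]>> sentences, int folds, int fold) {
        List<String[]> train = new ArrayList<>();
        for (int i = 0; i < sentences.size(); i++) {
            if (i % folds != fold) {
                train.addAll(sentences.get(i));
            }
        }
        return train;
    }

    public static void main(String[] args) {
        List<List<String[]>> sentences = Arrays.asList(
            Arrays.asList(new String[] {"run", "VB"}),
            Arrays.asList(new String[] {"run", "VB"}),
            Arrays.asList(new String[] {"walk", "VB"}, new String[] {"walk", "VB"}));
        int folds = 3;
        for (int fold = 0; fold < folds; fold++) {
            // The dictionary sees only the training partition of this fold.
            Set<String> dict = frequentPairs(trainingPartition(sentences, folds, fold), 2);
            System.out.println("fold " + fold + ": " + dict);
        }
    }
}
```

        In this toy data, the fold that holds out the third sentence yields a dictionary without any "walk" entry, because both occurrences of that word are in the evaluation partition.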

        William Colen made changes:
        Field        Original Value    New Value
        Status       Open              Closed
        Resolution                     Fixed

          People

          • Assignee:
            William Colen
            Reporter:
            William Colen
          • Votes:
            0
            Watchers:
            1
