UIMA-2110

Turn the HMMTagger class into a more generic class for tagging tasks

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.3
    • Fix Version/s: 2.3.1Addons
    • Component/s: Sandbox-Tagger
    • Labels:
      None
    • Environment:

      Description

      Despite its name, the org.apache.uima.examples.tagger.HMMTagger class is not
      fully independent of the POS tagging task.
      In addition, it assumes that the feature path to update with the tagging
      result is org.apache.uima.TokenAnnotation:posTag.

      We propose letting users specify the feature path to set through a parameter.
      This parameter is optional. If it is left unset, the tagger works as before,
      using org.apache.uima.TokenAnnotation:posTag as the default value.
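
      To illustrate the optional feature path parameter, here is a minimal sketch
      in plain Java of how a configured value could fall back to the default and
      be split into a type name and a feature name. The class FeaturePathParam and
      its resolve method are hypothetical names chosen for this illustration; they
      are not taken from the attached patches and do not use the UIMA API.

```java
// Hypothetical helper for illustration only; not part of the attached patches.
final class FeaturePathParam {
    // Default feature path named in the issue description.
    static final String DEFAULT_FEATURE_PATH = "org.apache.uima.TokenAnnotation:posTag";

    /**
     * Returns {typeName, featureName} for the configured path,
     * or for the default path when the parameter is null or blank.
     */
    static String[] resolve(String configured) {
        String path = (configured == null || configured.trim().isEmpty())
                ? DEFAULT_FEATURE_PATH
                : configured.trim();
        // Assumption: type and feature are separated by the last ':' in the path.
        int sep = path.lastIndexOf(':');
        if (sep <= 0 || sep == path.length() - 1) {
            throw new IllegalArgumentException("Expected <type>:<feature>, got: " + path);
        }
        return new String[] { path.substring(0, sep), path.substring(sep + 1) };
    }
}
```

      With this sketch, resolve(null) yields the default type and feature, while a
      value such as "my.Token:lemma" (an invented example) yields "my.Token" and "lemma".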

      We also propose adding three optional parameters: InputView, SentenceType and ModelFile.
      Since the HMM Learner already lets users specify the view used to train a
      model, we give the tagger the same option. By default, it works on the
      _InitialView. This is quite useful in practice.

      org.apache.uima.TokenAnnotation is not the only annotation type assumed
      to be present in the CAS. The HMMTagger processes tokens sentence by sentence and uses
      org.apache.uima.SentenceAnnotation to select the tokens. The SentenceType parameter
      lets users specify their own sentence annotation type. The default value is
      org.apache.uima.SentenceAnnotation.
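
      The sentence-by-sentence selection described above can be sketched in plain
      Java as follows. This is an assumption about the selection logic (a token
      belongs to a sentence when its offsets lie inside the sentence span), not the
      code of the HMMTagger itself. Span and SentenceTokenGrouping are hypothetical
      stand-ins for UIMA annotations and are not part of the UIMA API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; not the HMMTagger implementation.
final class SentenceTokenGrouping {
    /** Minimal stand-in for an annotation: a span of character offsets. */
    static final class Span {
        final int begin;
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
    }

    /** For each sentence, collects the tokens whose span lies entirely inside it. */
    static List<List<Span>> tokensBySentence(List<Span> sentences, List<Span> tokens) {
        List<List<Span>> result = new ArrayList<>();
        for (Span sentence : sentences) {
            List<Span> covered = new ArrayList<>();
            for (Span token : tokens) {
                if (token.begin >= sentence.begin && token.end <= sentence.end) {
                    covered.add(token);
                }
            }
            result.add(covered);
        }
        return result;
    }
}
```

      Because the grouping only depends on spans, the same logic applies whichever
      sentence annotation type the SentenceType parameter names.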

      The ModelFile parameter is an alternative to the resource declaration for specifying a model.
      If left empty, it is ignored. Otherwise it takes precedence over the resource declaration.
      When it is specified, multiple deployment of the tagger is not possible, but in practice users may find it easier to configure a parameter through Eclipse.
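
      The precedence rule between the ModelFile parameter and the resource
      declaration can be sketched as below. ModelSourceResolver and chooseModel are
      hypothetical names for this illustration, and failing when neither source is
      set is an assumption, not behavior stated in this issue.

```java
// Hypothetical helper for illustration only; not part of the attached patches.
final class ModelSourceResolver {
    /**
     * Returns the model location to load: the ModelFile parameter when it is
     * non-empty, otherwise the location from the resource declaration.
     */
    static String chooseModel(String modelFileParam, String resourceLocation) {
        if (modelFileParam != null && !modelFileParam.trim().isEmpty()) {
            return modelFileParam.trim(); // the parameter takes precedence
        }
        if (resourceLocation == null || resourceLocation.trim().isEmpty()) {
            // Assumption: a missing model is treated as a configuration error.
            throw new IllegalStateException("No model configured");
        }
        return resourceLocation.trim();
    }
}
```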

      Two distinct patches will be provided, one for the class and the other for the descriptor.

      A future improvement of the class might make it possible to create new annotations rather than only update existing ones.
      A future improvement of the descriptor might separate what belongs to the generic tagger from what is specific to the POS tagger.

      1. UIMA2110updated.patch
        14 kB
        Tommaso Teofili
      2. AMoreGenericHMMTaggerDesc.patch
        11 kB
        Nicolas Hernandez
      3. AMoreGenericHMMTaggerSrcClass.patch
        9 kB
        Nicolas Hernandez

        Activity

        Nicolas Hernandez created issue -
        Nicolas Hernandez made changes -
        Field Original Value New Value
        Attachment AMoreGenericHMMTaggerSrcClass.patch [ 12475624 ]
        Nicolas Hernandez made changes -
        Attachment AMoreGenericHMMTaggerDesc.patch [ 12475625 ]
        Tommaso Teofili made changes -
        Fix Version/s 2.3.1Addons [ 12316093 ]
        Tommaso Teofili made changes -
        Fix Version/s 2.3.1 [ 12314751 ]
        Fix Version/s 2.3.1Addons [ 12316093 ]
        Tommaso Teofili made changes -
        Attachment UIMA2110updated.patch [ 12485127 ]
        Tommaso Teofili made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Assignee Tommaso Teofili [ teofili ]
        Resolution Fixed [ 1 ]
        Tommaso Teofili made changes -
        Fix Version/s 2.3.1Addons [ 12316093 ]
        Tommaso Teofili made changes -
        Resolution Fixed [ 1 ]
        Status Resolved [ 5 ] Reopened [ 4 ]
        Tommaso Teofili made changes -
        Status Reopened [ 4 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Marshall Schor made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Tommaso Teofili
            Reporter:
            Nicolas Hernandez
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated:
              1.5h
              Remaining:
              1.5h
              Logged:
              Not Specified
