UIMA-2812

Support ResultSpecification

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: uimaFIT
    • Labels: None

    Description

      Provide support for controlling the output of a component using a ResultSpecification. Consider, for example, the use case in which a component can produce a "PartOfSpeech" annotation but should not, because another component in the same pipeline has already produced it or will produce it later. Here is some pseudocode:

      AnalysisEngineDescription aed = createPrimitiveDescription(Parser.class);
      // Tell Parser not to produce PartOfSpeech annotations
      ResultUtil.removeType(aed, PartOfSpeech.class);
      

      How to "remove" a type? UIMA requires that a ResultSpecification contain all the types that the component produces, so excluding a type would normally require adding every type except the ones that should not be produced. uimaFIT has access to the capability annotations, which it could use to pre-fill a result specification with all the types that a component could produce, allowing the user to conveniently remove the ones not required.
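
The "pre-fill, then remove" idea can be illustrated with a small library-free model. This is only a sketch: `ResultSpecSketch`, `prefill`, and `removeType` are hypothetical names, plain strings stand in for type names, and none of this is UIMA or uimaFIT API.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Library-free model of the "pre-fill, then remove" idea; not UIMA or uimaFIT API. */
final class ResultSpecSketch {
    private ResultSpecSketch() {}

    /** Pre-fill the result set with every output type the component declares. */
    static Set<String> prefill(List<String> declaredOutputs) {
        return new LinkedHashSet<>(declaredOutputs);
    }

    /** Return a copy of the result set without the given type; the input is left unchanged. */
    static Set<String> removeType(Set<String> spec, String typeName) {
        Set<String> copy = new LinkedHashSet<>(spec);
        copy.remove(typeName);
        return copy;
    }

    public static void main(String[] args) {
        // Assumed outputs of a parser, as they might be read from its capability annotations.
        Set<String> all = prefill(List.of("Token", "PartOfSpeech", "Constituent"));
        // The user only names the type that should NOT be produced.
        Set<String> wanted = removeType(all, "PartOfSpeech");
        System.out.println(wanted);
    }
}
```

The point of the design is that the user states only the exclusion; the full list of remaining types is derived from what the component already declares.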

      How to transport the information? Unfortunately, there appears to be no way to store the ResultSpecification as part of an AnalysisEngineDescription. As far as I can see, UIMA has two ways to control the ResultSpecification for a component:

      • via the component's capabilities
      • via a parameter passed to the AnalysisEngine.process method (or via setResultSpecification)

      There are two scenarios I can imagine:

      • at description time: changes to the result specification are added to the descriptor.
        • Add the ResultSpecification to the component descriptor. Unfortunately, this is not supported by UIMA.
        • Change the capabilities. E.g. uimaFIT creates an AE descriptor with the capabilities filled in, then one could add or remove types/features there.
      • at runtime: uimaFIT could be used to acquire an initial ResultSpecification from the annotation on the AE class, which can then be modified to add/remove types/features. The final specification needs to be passed in some way into the pipeline execution code:
        • along with the component descriptor: pairs of {descriptor, resultspec} would need to be passed to the pipeline execution code (e.g. SimplePipeline), making the API more complex.

        • as part of already instantiated components: in case of SimplePipeline, there are also non-descriptor-based methods that could be used, in which case the result specifications could be set on each component individually before passing them into the pipeline code.
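
To make the cost of the pair-based runtime option concrete, here is a library-free sketch of a pipeline runner that takes {descriptor, resultspec} pairs. All names (`PairedPipelineSketch`, `Step`, `run`) are hypothetical, a component name stands in for its descriptor, and the "pipeline" merely records which declared outputs each step is allowed to produce. It does not use or imitate SimplePipeline.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Library-free model of passing {descriptor, resultspec} pairs to a pipeline runner. */
final class PairedPipelineSketch {
    private PairedPipelineSketch() {}

    /** Hypothetical pair: a component (standing in for its descriptor) plus its result spec. */
    static final class Step {
        final String component;
        final List<String> declaredOutputs;
        final Set<String> resultSpec;

        Step(String component, List<String> declaredOutputs, Set<String> resultSpec) {
            this.component = component;
            this.declaredOutputs = declaredOutputs;
            this.resultSpec = resultSpec;
        }
    }

    /** Runs the steps in order; each step yields only the declared outputs its result spec allows. */
    static List<String> run(List<Step> steps) {
        List<String> produced = new ArrayList<>();
        for (Step step : steps) {
            for (String type : step.declaredOutputs) {
                if (step.resultSpec.contains(type)) {
                    produced.add(step.component + ":" + type);
                }
            }
        }
        return produced;
    }

    public static void main(String[] args) {
        List<Step> steps = List.of(
            new Step("Tagger", List.of("PartOfSpeech"), Set.of("PartOfSpeech")),
            // The parser could also produce PartOfSpeech, but its result spec excludes it.
            new Step("Parser", List.of("PartOfSpeech", "Constituent"), Set.of("Constituent")));
        System.out.println(run(steps));
    }
}
```

Every caller now has to build and carry the extra `Step` wrapper, which is the added API complexity the first sub-option refers to.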

      Does it fit into the uimaFIT concept? So far, it has been possible to implement uimaFIT in such a way that all information pertaining to the component configuration can be reflected, configured, and stored in a descriptor, so that any UIMA execution engine can then pick up the descriptor and execute the component as it was configured. UIMA appears to lack the concept of a ResultSpecification as part of the descriptors. In particular, that seems to affect the ability to configure results within aggregate analysis engines.

      Conclusion: Since a ResultSpecification cannot be stored in a descriptor, the next best thing appears to be adding some convenience methods to change the reflected capabilities in the descriptor.

          People

            Assignee: Unassigned
            Reporter: Richard Eckart de Castilho (rec)