  1. UIMA
  2. UIMA-6137

Type-based filtering in Ruta rules

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Ruta
    • Labels:
      None

      Description

      The visibility concept in Ruta is not type-based but type coverage-based: filtered types hide the area they cover from the Ruta rules, i.e. these areas become invisible to the rules.

      We have a use case where we want to hide only the types themselves from the rules, not the text areas they cover; other types found in these areas should still be considered by the rules.
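
The difference between the two filtering semantics can be illustrated with a small toy model. This is only a sketch in Python, not Ruta code and not how Ruta is implemented: the `Annotation` class, both function names, and the `Title`/`Keyword` types are hypothetical, and the coverage rule is simplified to "an annotation is hidden if it lies entirely inside a span covered by a filtered type".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical stand-in for a typed annotation over a text span."""
    type: str
    begin: int
    end: int


def visible_coverage_based(annotations, filtered_types):
    """Simplified model of the current behaviour described above:
    every span covered by a filtered type hides all annotations inside it."""
    hidden_spans = [(a.begin, a.end) for a in annotations
                    if a.type in filtered_types]

    def is_hidden(a):
        return any(b <= a.begin and a.end <= e for b, e in hidden_spans)

    return [a for a in annotations if not is_hidden(a)]


def visible_type_based(annotations, filtered_types):
    """Requested behaviour: only annotations of the filtered types are hidden;
    other annotations in the same area stay visible."""
    return [a for a in annotations if a.type not in filtered_types]


# Toy data: a Keyword inside a Title area, and a Keyword outside of it.
annotations = [
    Annotation("Title", 0, 20),
    Annotation("Keyword", 7, 20),
    Annotation("Keyword", 25, 30),
]

coverage_result = visible_coverage_based(annotations, {"Title"})
type_result = visible_type_based(annotations, {"Title"})
```

In this toy data, filtering `Title` under the coverage-based model leaves only the `Keyword` at 25-30 visible, while the type-based model keeps both `Keyword` annotations and hides only the `Title` itself.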

      We use Ruta as part of the normalization process, where different text areas are marked with annotations associated with the tags in the original content (title, abstract/summary, body, COI, authors, citations etc.), and Ruta is part of the parsing process that produces this view. Using only the content annotations, Ruta is then used to mark up which areas to include in a new view for doing NLP. This approach gives us maximum traceability of the normalization process.

      However, the different types of content annotations can sometimes interfere with the rules in ways beyond our control. Our current solution leads to more awkward rules that are hard to verify, and also to a less performant implementation. In our case the problem would be better solved if we could tell Ruta simply to ignore certain types, i.e. make them invisible to the Ruta rules. Preferably, we want to be able to add and remove filtered types in the script, similar to how it works with the coverage-based type filter.
      Please see also this mailing list thread, where a toy example of the problem is discussed:

      https://lists.apache.org/thread.html/604417ac76ab85fc8d87eef12d4696b89d3257b7a53719518d9f5408@<user.uima.apache.org>

            People

            • Assignee:
              pkluegl Peter Klügl
              Reporter:
              mjuric Mario Juric
