UIMA-958

Document Analyzer not showing PersonTitle when running with xml tagged source

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.3
    • Component/s: Tools
    • Labels: None

    Description

      Running the Document Analyzer on XML input documents with an XML tag specified (here, TEXT) causes the PersonTitle annotator to show no results. The annotator is being given a document language of x-unspecified, even though the collection reader set the document language to "en". The cause is the XmlDetagger that gets inserted into the aggregate: it passes a CAS view named "plain text" to the PersonTitle annotator, but does not copy the language specifier from its input CAS view. In fact, its input view, "xmlDocument", does not have the language either; the collection reader puts the language only into "_InitialView". Fix by having the XmlDetagger copy the initial view's language into the plain text view it creates for downstream annotators to work on.

      Note that there are two copies of this class, one in tools and one in examples; fix both.
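
      The fix described above can be sketched roughly as follows. This is a simplified stand-in model, not the actual XmlDetagger source: the class DetaggerLanguageSketch, the View class, and the detag and extractTagText helpers are hypothetical names used only for illustration, and a plain Map stands in for the CAS. In real UIMA code the corresponding step would presumably use CAS.getDocumentLanguage() on the initial view and CAS.setDocumentLanguage(String) on the newly created view.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the fix: views are plain objects held in a Map.
// None of these types are UIMA classes; they only mirror the idea.
public class DetaggerLanguageSketch {
    public static final String INITIAL_VIEW = "_InitialView";
    public static final String XML_VIEW = "xmlDocument";
    public static final String PLAIN_TEXT_VIEW = "plain text";
    public static final String UNSPECIFIED = "x-unspecified";

    // Hypothetical stand-in for a CAS view: document text plus language.
    public static final class View {
        public String text = "";
        public String language = UNSPECIFIED;
    }

    // Returns the text between the first <tag> and the following </tag>,
    // or an empty string if the tag pair is not found.
    public static String extractTagText(String xml, String tag) {
        String open = "<" + tag + ">";
        String close = "</" + tag + ">";
        int start = xml.indexOf(open);
        if (start < 0) {
            return "";
        }
        int contentStart = start + open.length();
        int end = xml.indexOf(close, contentStart);
        if (end < 0) {
            return "";
        }
        return xml.substring(contentStart, end);
    }

    // Creates the plain text view from the XML view. The language is
    // copied from the initial view, because that is the only view the
    // collection reader set it on.
    public static View detag(Map<String, View> cas, String tag) {
        View xml = cas.get(XML_VIEW);
        View plain = new View();
        plain.text = extractTagText(xml.text, tag);

        View initial = cas.get(INITIAL_VIEW);
        if (initial != null) {
            plain.language = initial.language; // the actual fix
        }
        cas.put(PLAIN_TEXT_VIEW, plain);
        return plain;
    }

    public static void main(String[] args) {
        Map<String, View> cas = new HashMap<>();

        View initial = new View();
        initial.language = "en"; // set by the collection reader
        cas.put(INITIAL_VIEW, initial);

        View xml = new View(); // language left unspecified, as in the report
        xml.text = "<DOC><TEXT>Dr. Smith met Mr. Jones.</TEXT></DOC>";
        cas.put(XML_VIEW, xml);

        View plain = detag(cas, "TEXT");
        System.out.println(plain.language + ": " + plain.text);
    }
}
```

      Without the copy step, the plain text view would keep the default of x-unspecified, which matches the symptom reported here.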

          People

            schor Marshall Schor
            schor Marshall Schor
