UIMA-5791

UIMA-AS: fix client SAXParseException when deserializing metadata


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0AS
    • Component/s: Async Scaleout
    • Labels: None

    Description

      The XML parser fails with a SAXParseException when the client tries to deserialize service metadata. The scenario that causes the error is:

      The UIMA-AS client runs on Windows

      The service runs on Linux

      The client sends a getMeta request and receives a response from the service. The client then tries to deserialize the metadata and gets:

      Jun 06, 2018 2:25:10 PM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl$2 onMessage
      WARNING: org.apache.uima.util.InvalidXMLException: Invalid descriptor at <unknown source>.
          at org.apache.uima.util.impl.XMLParser_impl.parse(XMLParser_impl.java:219)
          at org.apache.uima.util.impl.XMLParser_impl.parseResourceMetaData(XMLParser_impl.java:438)
          at org.apache.uima.util.impl.XMLParser_impl.parseResourceMetaData(XMLParser_impl.java:420)
          at org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl.handleMetadataReply(BaseUIMAAsynchronousEngineCommon_impl.java:1178)
          at org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl$2.run(BaseUIMAAsynchronousEngineCommon_impl.java:2065)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
          at java.lang.Thread.run(Thread.java:811)
      Caused by: org.xml.sax.SAXParseException: Invalid byte 1 of 1-byte UTF-8 sequence.
          at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
          at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
          at org.apache.uima.util.impl.XMLParser_impl.parse(XMLParser_impl.java:202)
          ... 7 more

       

      A workaround for the above was to set -Dfile.encoding=UTF-8 on the client JVM.
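
      The fact that forcing the client's default encoding to UTF-8 avoids the error suggests an encoding mismatch: text converted to bytes with a non-UTF-8 platform default, then parsed as UTF-8. The sketch below reproduces that kind of mismatch in isolation. It is not UIMA-AS code; the class name, the sample payload, and the use of ISO-8859-1 as a stand-in for a Windows default code page are all assumptions for illustration, and the exact exception type and message depend on the parser in use.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class EncodingMismatchDemo {
    // Hypothetical metadata payload; the non-ASCII character is what exposes the mismatch.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?><meta><name>caf\u00e9</name></meta>";

    /** Encodes XML with the given charset and reports whether the SAX parser accepts the bytes. */
    public static boolean parses(Charset charset) throws Exception {
        byte[] bytes = XML.getBytes(charset);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(bytes), new DefaultHandler());
            return true;
        } catch (SAXException | IOException e) {
            // Bytes that are not valid UTF-8 are rejected by the parser.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("UTF-8 bytes parse:      " + parses(StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1 bytes parse: " + parses(StandardCharsets.ISO_8859_1));
    }
}
```

      Calling String.getBytes() with no argument uses the JVM default charset, which is why -Dfile.encoding=UTF-8 would mask a problem of this shape.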

      Review the code and provide a fix. Perhaps the XML InputSource has a way to set the encoding. The default should be UTF-8. It seems we need a new UIMA-AS property (or command-line argument) to override the default in case a user needs a different encoding.
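
      A minimal sketch of the approach proposed here, not the committed fix: org.xml.sax.InputSource does provide setEncoding(String), so the client could encode the metadata with an explicit charset and tell the parser which one was used. The class name and the property name example.metadata.encoding are hypothetical; the actual UIMA-AS property, if one was added, may be named and wired differently.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import org.xml.sax.InputSource;

public class MetadataInputSource {
    // Hypothetical override property; not an actual UIMA-AS setting.
    public static final String ENCODING_PROPERTY = "example.metadata.encoding";

    /** Returns the charset named by the override property, or UTF-8 if it is unset. */
    public static Charset configuredCharset() {
        String name = System.getProperty(ENCODING_PROPERTY);
        return name == null ? StandardCharsets.UTF_8 : Charset.forName(name);
    }

    /** Builds an InputSource whose bytes and declared encoding always agree. */
    public static InputSource toInputSource(String xml) {
        Charset charset = configuredCharset();
        // Never rely on the platform default: encode with an explicit charset.
        InputSource source = new InputSource(new ByteArrayInputStream(xml.getBytes(charset)));
        // Tell the parser which encoding the byte stream uses.
        source.setEncoding(charset.name());
        return source;
    }
}
```

      When the metadata is already held as a String, another option is to hand the parser a character stream, for example new InputSource(new StringReader(xml)), which sidesteps byte encoding altogether.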

       

          People

            Assignee: Jaroslaw Cwiklik (cwiklik)
            Reporter: Jaroslaw Cwiklik (cwiklik)