  1. Abdera
  2. ABDERA-222

Parse failures reading UTF-8 XML files whose attribute values contain valid non-US-ASCII UTF-8 characters

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.4.0
    • Fix Version/s: None
    • Labels:
      None
    • Environment:
      Solaris x86_64, Mac OS X Leopard x86_64, Linux x86_64

      Description

      Parsing fails when XML items fetched with http-client 3.1 are passed to the parser directly from the response stream:
      parser.parse(response.getResponseBodyAsStream());

      The same items parse correctly if they are first written to a byte array and a ByteArrayInputStream over that byte array is passed to parse.
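
The byte-array workaround mentioned in the description could look roughly like the sketch below. It only illustrates the buffering step and does not address the underlying cause of the failure. The class name BufferedBody and the method buffer are hypothetical helpers introduced here; they are not part of Abdera or http-client.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not part of Abdera or http-client): copies a stream
// fully into memory so the parser reads from a plain byte array.
public final class BufferedBody {
    private BufferedBody() {}

    /** Drains the given stream and returns an in-memory stream over the same bytes. */
    public static ByteArrayInputStream buffer(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        // read() returns -1 at end of stream; copy every chunk before that
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return new ByteArrayInputStream(out.toByteArray());
    }
}
```

With such a helper, the failing call would become parser.parse(BufferedBody.buffer(response.getResponseBodyAsStream())); the whole response body is held in memory, so this is only reasonable for items of modest size.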

      Caused by: com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character (NULL, unicode 0) encountered: not valid in any content
      at [row,col {unknown-source}]: [3,56]
      at com.ctc.wstx.sr.StreamScanner.constructNullCharException(StreamScanner.java:615)
      at com.ctc.wstx.sr.StreamScanner.throwInvalidSpace(StreamScanner.java:644)
      at com.ctc.wstx.sr.BasicStreamReader.readTextPrimary(BasicStreamReader.java:4554)
      at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2886)
      at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
      at org.apache.abdera.parser.stax.FOMBuilder.getNextElementToParse(FOMBuilder.java:163)
      at org.apache.abdera.parser.stax.FOMBuilder.next(FOMBuilder.java:187)

        Attachments

        1. ChunkedTransferFailure.java
          9 kB
          Jason Venner (www.prohadoop.com)


            People

            • Assignee:
              jasnell James M Snell
              Reporter:
              jv_ning Jason Venner (www.prohadoop.com)
            • Votes:
              0
              Watchers:
              1
