  1. Camel
  2. CAMEL-13374

XMLTokenExpressionIterator Default Exchange charset overrides original xml encoding from InputStream


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.18.0, 2.22.0
    • Fix Version/s: 3.10.0
    • Component/s: camel-core
    • Labels: None
    • Estimated Complexity: Unknown

    Description

      The default Exchange charset overrides the original XML encoding read from the InputStream.

      In

      org.apache.camel.support.XMLTokenExpressionIterator.doEvaluate(Exchange exchange, boolean closeStream)

      String charset = IOHelper.getCharsetName(exchange);

      must be replaced with

      String charset = IOHelper.getCharsetName(exchange, false);

      Then, the two constructor calls

      // woodstox's getLocation().getCharacterOffset() does not return the offset correctly for InputStream, so use Reader instead.
      this(path, nsmap, mode, group, new InputStreamReader(in, charset));

      and

      // woodstox's getLocation().getCharacterOffset() does not return the offset correctly for InputStream, so use Reader instead.
      this(path, nsmap, mode, 1, new InputStreamReader(in, charset));

      should use

      org.apache.commons.io.input.XmlStreamReader instead of a plain InputStreamReader.

      XmlStreamReader correctly determines the encoding from the XML declaration when one is present.
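
The intended resolution order can be sketched with JDK classes only. This is not the Camel fix and it does not use XmlStreamReader: it reads the declared encoding through the JDK's StAX parser instead. The class name DeclaredEncodingSketch, the helper names, and the UTF-8 fallback are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: prefer an explicitly configured charset, otherwise
// honour the encoding named in the XML declaration.
class DeclaredEncodingSketch {

    // Returns the encoding from the XML declaration, or null if none is declared.
    static String declaredEncoding(byte[] xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader =
                factory.createXMLStreamReader(new ByteArrayInputStream(xml));
        try {
            return reader.getCharacterEncodingScheme();
        } finally {
            reader.close();
        }
    }

    // explicitCharset stands in for a charset set on the Exchange; null means
    // "not set", which is what getCharsetName(exchange, false) is expected to
    // report according to the description above.
    static String resolveCharset(String explicitCharset, byte[] xml)
            throws XMLStreamException {
        if (explicitCharset != null) {
            return explicitCharset;
        }
        String declared = declaredEncoding(xml);
        // Assumed fallback when the document declares nothing.
        return declared != null ? declared : StandardCharsets.UTF_8.name();
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = ("<?xml version = \"1.0\" encoding= \"ISO-8859-5\" standalone=\"no\" ?>"
                + "<xml/>").getBytes(StandardCharsets.US_ASCII);
        System.out.println(resolveCharset(null, xml));    // no explicit charset
        System.out.println(resolveCharset("UTF-8", xml)); // explicit charset wins
    }
}
```

With no explicit charset the declared encoding is used; an explicit charset still takes precedence, which keeps existing routes that set one unaffected.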

       

      Example document in the InputStream body:

      <?xml version = "1.0" encoding= "ISO-8859-5" standalone="no" ?>

      <xml/>

      Current result: charset is UTF-8 (the default from IOHelper.getCharsetName(exchange))

      Expected result: ISO-8859-5
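
Why the wrong charset matters can be shown with a small, JDK-only sketch. It does not involve Camel; the class name CharsetMismatchSketch and the Cyrillic sample text are assumptions added for illustration, since the example document above has an empty root element.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: decode the same ISO-8859-5 bytes with the declared
// charset and with a UTF-8 default, as an InputStreamReader would.
class CharsetMismatchSketch {

    static String decode(byte[] bytes, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader =
                new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Charset declared = Charset.forName("ISO-8859-5");
        // Sample body with Cyrillic text, encoded as the declaration states.
        String original = "<?xml version=\"1.0\" encoding=\"ISO-8859-5\"?>"
                + "<xml>\u041F\u0440\u0438\u0432\u0435\u0442</xml>";
        byte[] body = original.getBytes(declared);

        // Decoding with the declared charset round-trips the text.
        System.out.println(decode(body, declared).equals(original));
        // Decoding with UTF-8 corrupts the non-ASCII characters.
        System.out.println(decode(body, StandardCharsets.UTF_8).equals(original));
    }
}
```

The first comparison prints true and the second prints false, because the single-byte Cyrillic codes are not valid UTF-8 sequences and are replaced during decoding.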

       


            People

              Assignee: Claus Ibsen (davsclaus)
              Reporter: Sergey Smith (smithsv)
              Votes: 0
              Watchers: 3
