CXF-7320

Returned charset is not decoded properly



    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.9, 3.1.10
    • Fix Version/s: 3.1.12, 3.0.14, 3.2.0
    • Component/s: JAX-RS
    • Labels: None
    • Estimated Complexity: Unknown


      I have a simple CXF JAX-RS client that calls a simple web service which returns a short text/plain response (using UTF-8). However, when I try to read the entity of this response as a String, I get an exception because the client failed to decode (or maybe unquote is more accurate) the charset parameter properly.

      The client code I'm using is this:

      	public static void main(String[] args) {
      		WebTarget target = ClientBuilder.newClient().target("http://example.org").path("test");
      		Response response = target.request().get();
      		System.out.println("status: " + response.getStatus());
      		System.out.println("type: " + response.getMediaType());
      		System.out.println("entity: " + response.readEntity(String.class));
      	}

      The output I get is this:

      status: 200
      type: text/plain;charset="UTF-8"
      19:58:35.095 [main] ERROR org.apache.cxf.jaxrs.utils.JAXRSUtils - Problem with reading the data, class java.lang.String, ContentType: text/plain;charset="UTF-8".
      Exception in thread "main" javax.ws.rs.client.ResponseProcessingException: Problem with reading the data, class java.lang.String, ContentType: text/plain;charset="UTF-8".
      	at org.apache.cxf.jaxrs.impl.ResponseImpl.reportMessageHandlerProblem(ResponseImpl.java:439)
      	at org.apache.cxf.jaxrs.impl.ResponseImpl.doReadEntity(ResponseImpl.java:379)
      	at org.apache.cxf.jaxrs.impl.ResponseImpl.readEntity(ResponseImpl.java:320)
      	at org.apache.cxf.jaxrs.impl.ResponseImpl.readEntity(ResponseImpl.java:310)
      	at com.example.Main.main(Main.java:17)
      Caused by: java.io.UnsupportedEncodingException: "UTF-8"
      	at sun.nio.cs.StreamDecoder.forInputStreamReader(StreamDecoder.java:71)
      	at java.io.InputStreamReader.<init>(InputStreamReader.java:100)
      	at org.apache.cxf.helpers.IOUtils.toString(IOUtils.java:302)
      	at org.apache.cxf.helpers.IOUtils.toString(IOUtils.java:288)
      	at org.apache.cxf.jaxrs.provider.StringTextProvider.readFrom(StringTextProvider.java:45)
      	at org.apache.cxf.jaxrs.provider.StringTextProvider.readFrom(StringTextProvider.java:36)
      	at org.apache.cxf.jaxrs.utils.JAXRSUtils.readFromMessageBodyReader(JAXRSUtils.java:1374)
      	at org.apache.cxf.jaxrs.impl.ResponseImpl.doReadEntity(ResponseImpl.java:370)
      	... 3 more
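
      The bottom of the trace can be reproduced without CXF. The following is a minimal sketch using only the JDK; the class and method names (QuotedCharsetRepro, canDecodeWith) are made up for illustration. It passes the charset name to InputStreamReader exactly as it appears in the header, quotes included, which is what the trace suggests IOUtils.toString ends up doing.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of CXF.
public class QuotedCharsetRepro {

    /** Returns true if InputStreamReader accepts the given charset name. */
    static boolean canDecodeWith(String charsetName) {
        InputStream in = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        try {
            // The constructor throws UnsupportedEncodingException for unknown or illegal names.
            new InputStreamReader(in, charsetName).close();
            return true;
        } catch (UnsupportedEncodingException e) {
            return false;
        } catch (IOException e) {
            // close() on an in-memory stream is not expected to fail.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canDecodeWith("UTF-8"));      // prints true
        System.out.println(canDecodeWith("\"UTF-8\""));  // prints false: the quotes are part of the name
    }
}
```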

      This shows that the server returned the content-type as text/plain;charset="UTF-8", which is valid according to the RFCs: the charset value uses quoted-string syntax. (I think the RFCs recommend not quoting in this case, but a quoted value is still valid AFAIK, so the client should support it, shouldn't it?)

      The relevant line in StringTextProvider seems perfectly sensible:

      return IOUtils.toString(is, HttpUtils.getEncoding(mt, StandardCharsets.UTF_8.name()));

      In isolation, the HttpUtils.getEncoding method it uses also seems sensible - however, it assumes that any quoting of the MediaType parameters has already been decoded, while CXF-4504 says that they shouldn't be and it's up to whatever uses them to do that decoding.
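
      One way the missing decoding step could look is sketched below. This is not the actual CXF fix; CharsetParam, unquote and charsetOf are hypothetical names, and the sketch assumes the raw parameter value is available as a String (for example from MediaType.getParameters()). It strips one level of quoted-string quoting, resolves backslash escapes, and falls back to a default charset when the name is missing or not recognised by the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of CXF.
public class CharsetParam {

    /** Removes surrounding double quotes and backslash escapes from a parameter value. */
    static String unquote(String value) {
        if (value == null) {
            return null;
        }
        String v = value.trim();
        if (v.length() < 2 || v.charAt(0) != '"' || v.charAt(v.length() - 1) != '"') {
            return v; // not a quoted-string, use as-is
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < v.length() - 1; i++) {
            char c = v.charAt(i);
            if (c == '\\' && i + 1 < v.length() - 1) {
                c = v.charAt(++i); // quoted-pair: keep the escaped character
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Resolves a raw charset parameter value, using the fallback if it is absent or unknown. */
    static Charset charsetOf(String rawParam, Charset fallback) {
        String name = unquote(rawParam);
        if (name == null || name.isEmpty()) {
            return fallback;
        }
        try {
            return Charset.forName(name);
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            // Falling back silently is an assumption of this sketch.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("\"UTF-8\"", StandardCharsets.ISO_8859_1)); // prints UTF-8
    }
}
```

      Whether an unknown charset should fall back silently or raise an error is a design choice; the sketch picks the fallback only to keep the example short.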

      While I've found an easy workaround for my particular case (in part because this server always uses UTF-8), I still think the client should handle this so that a workaround isn't necessary.

      For reference, the workaround I came up with was inserting this before the readEntity call:

      		if ("\"UTF-8\"".equals(response.getMediaType().getParameters().get("charset"))) {
      			response.getHeaders().putSingle(HttpHeaders.CONTENT_TYPE, response.getMediaType().withCharset("UTF-8"));
      		}




            Assignee: Sergey Beryozkin (sergey_beryozkin)
            Reporter: Frode Austvik (edorfaus)