CXF
  1. CXF
  2. CXF-3518

WebClient doesn't handle responses containing a quoted-string in a header correctly

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 2.4, 2.3.4
    • Fix Version/s: 2.4.1, 2.3.5
    • Component/s: JAX-RS
    • Labels:
      None
    • Environment:

      JDK 1.6

    • Estimated Complexity:
      Novice

      Description

      Similar to CXF-2462, if a request returns a response header that looks like the following:

      q: "stuff with commas, and spaces"

      WebClient's response.getMetadata() returns 2 values:

      1. "stuff with commas
      2. and spaces"

      while the correct response should be:

      1. stuff with commas, and spaces

      Here's the WebClient code which reproduces the issue (as long as the server response contains a header that returns the header value above):

      Response r = WebClient.create(new URI("http://localhost:8080")).path("key/quoted").get();
      List<Object> l = r.getMetadata().get("q");
      
      1. trunk.webclient.quote.try2.diff
        10 kB
        Ka-Lok Fung
      2. trunk.wc_quote_minimal.diff
        3 kB
        Ka-Lok Fung

        Activity

        Daniel Kulp made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Sergey Beryozkin made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Assignee Sergey Beryozkin [ sergey_beryozkin ]
        Resolution Fixed [ 1 ]
        Hide
        Sergey Beryozkin added a comment -

        Patch with a minor modification has been applied, many thanks.
        I just had to update AbstractClient to check if a given value was empty or null, before testing if it was quoted

        http://svn.apache.org/viewvc?rev=1104217&view=rev

        Show
        Sergey Beryozkin added a comment - Patch with a minor modification has been applied, many thanks. I just had to update AbstractClient to check if a given value was empty or null, before testing if it was quoted http://svn.apache.org/viewvc?rev=1104217&view=rev
        Ka-Lok Fung made changes -
        Attachment trunk.webclient.quote.try2.diff [ 12479434 ]
        Hide
        Ka-Lok Fung added a comment -

        I've implemented the "good enough" proposal per the feedback from Sergey.

        As long as long all the values inside the header are appropriately quoted, they will be available to WebClient. If the values aren't quoted correctly, WebClient will apply a "best effort" approach to return the header data from the server to the client.

        I've also added/modified three system tests to component test these changes:

        • I've extended the ETag test to make sure the current WebClient behaviour remains - that is, if the server returned quoted values, WebClient also return quoted values.
        • Added a test for quoted-strings in headers
        • Added a test for malformed quoted-strings in headers & broken per spec case discussed above
        Show
        Ka-Lok Fung added a comment - I've implemented the "good enough" proposal per the feedback from Sergey. As long as long all the values inside the header are appropriately quoted, they will be available to WebClient . If the values aren't quoted correctly, WebClient will apply a "best effort" approach to return the header data from the server to the client. I've also added/modified three system tests to component test these changes: I've extended the ETag test to make sure the current WebClient behaviour remains - that is, if the server returned quoted values, WebClient also return quoted values. Added a test for quoted-strings in headers Added a test for malformed quoted-strings in headers & broken per spec case discussed above
        Hide
        Sergey Beryozkin added a comment -

        At the moment the fact that HttpURLConnection is directly dealt with is problematic. We will definitely refactor the client runtime, for the async conduit be supported better, for a local transport be supported, etc. That will have to become a priority soon enough, however we are more focused at the moment on wadl and security.

        In meantime, I tend to update real system tests to test such cases, ex, please copy & paste one of BookStore methods in systest/jaxrs and then add a test invocation in JAXRSClientServerBookTest. That will work with Jetty. I believe one can do a similar test even in rt/frontend/jaxrs, but I'll need to review how it can be done

        Cheers, Sergey

        Show
        Sergey Beryozkin added a comment - At the moment the fact that HttpURLConnection is directly dealt with is problematic. We will definitely refactor the client runtime, for the async conduit be supported better, for a local transport be supported, etc. That will have to become a priority soon enough, however we are more focused at the moment on wadl and security. In meantime, I tend to update real system tests to test such cases, ex, please copy & paste one of BookStore methods in systest/jaxrs and then add a test invocation in JAXRSClientServerBookTest. That will work with Jetty. I believe one can do a similar test even in rt/frontend/jaxrs, but I'll need to review how it can be done Cheers, Sergey
        Hide
        Ka-Lok Fung added a comment -

        Okay, that works for me. I'll work on this tonight then.

        By the way, did you have any suggestions for the unit test question I had? Is there a way I can easily/moderately add a unit test for this?

        -kl

        Show
        Ka-Lok Fung added a comment - Okay, that works for me. I'll work on this tonight then. By the way, did you have any suggestions for the unit test question I had? Is there a way I can easily/moderately add a unit test for this? -kl
        Hide
        Sergey Beryozkin added a comment -

        Hi

        Failing

        • SomeHeader: some text,"other quoted, text","blah"

        is not a problem because it's reasonable to expect multiple SomeHeader headers in such cases.

        SomeHeader: some text
        SomeHeader: "other quoted, text","blah"

        If we ever get a single line like that then we can see what can be improved. We can always have contextual properties which can tell the client which headers may need some special processing/etc

        thanks, Sergey

        Show
        Sergey Beryozkin added a comment - Hi Failing SomeHeader: some text,"other quoted, text","blah" is not a problem because it's reasonable to expect multiple SomeHeader headers in such cases. SomeHeader: some text SomeHeader: "other quoted, text","blah" If we ever get a single line like that then we can see what can be improved. We can always have contextual properties which can tell the client which headers may need some special processing/etc thanks, Sergey
        Hide
        Ka-Lok Fung added a comment -

        For my use cases, your suggestion will work:

        So would checking if the header value starts with \" and if yes then parse it using fast indexOf will work well in most/all such cases ?

        However, this is technically allowed (per RFC2616, section 2.2, see quoted-pair)

        SomeHeader: "some text, some more text with inlined \""

        What do you think of me submitting another patch that works for the following use cases:

        • SomeHeader: "some text, some more text"
        • SomeHeader: "some text","quoted,text","even more text"
        • SomeHeader: "some text, some more text with inlined \""

        but would fail for the following use case:

        • SomeHeader: some text,"other quoted, text","blah"

        Because we're doing text scanning, I'm thinking that the most efficient way to do this is actually going character by character in the string (converting to char[] first) than using indexOf. However, I could do a benchmark to see which is faster and choose that for the patch

        What do you think?

        Show
        Ka-Lok Fung added a comment - For my use cases, your suggestion will work: So would checking if the header value starts with \" and if yes then parse it using fast indexOf will work well in most/all such cases ? However, this is technically allowed (per RFC2616, section 2.2, see quoted-pair) SomeHeader: "some text, some more text with inlined \"" What do you think of me submitting another patch that works for the following use cases: SomeHeader: "some text, some more text" SomeHeader: "some text","quoted,text","even more text" SomeHeader: "some text, some more text with inlined \"" but would fail for the following use case: SomeHeader: some text,"other quoted, text","blah" Because we're doing text scanning, I'm thinking that the most efficient way to do this is actually going character by character in the string (converting to char[] first) than using indexOf. However, I could do a benchmark to see which is faster and choose that for the patch What do you think?
        Hide
        Sergey Beryozkin added a comment -

        Hi, thanks for this patch.

        I'm a bit concerned about introducing HttpHeaderImpl in there, given that some headers, notably Set-Cookie, can contain a combination of all sort of values, ex,
        Set-Cookie: bar.com.anoncart=107894933471602436; Domain=.bar.com;Expires=Thu, 01-Oct-2020 23:44:22 GMT; Path=/

        and I'm worried we could break some existing code there if we try to generalize, because HttpHeadersImpl will split by default.

        What do you think of the following approach:

        I reckon that realistically, as far as \" character is concerned, we can expect
        SomeHeader: "some text, some more text"

        or

        SomeHeader: "some text, some more text","even more text"

        but not

        SomeHeader: "some text, some more text with inlined \""
        which may not be even legal.

        So would checking if the header value starts with \" and if yes then parse it using fast indexOf will work well in most/all such cases ?

        Cheers, Sergey

        Show
        Sergey Beryozkin added a comment - Hi, thanks for this patch. I'm a bit concerned about introducing HttpHeaderImpl in there, given that some headers, notably Set-Cookie, can contain a combination of all sort of values, ex, Set-Cookie: bar.com.anoncart=107894933471602436; Domain=.bar.com;Expires=Thu, 01-Oct-2020 23:44:22 GMT; Path=/ and I'm worried we could break some existing code there if we try to generalize, because HttpHeadersImpl will split by default. What do you think of the following approach: I reckon that realistically, as far as \" character is concerned, we can expect SomeHeader: "some text, some more text" or SomeHeader: "some text, some more text","even more text" but not SomeHeader: "some text, some more text with inlined \"" which may not be even legal. So would checking if the header value starts with \" and if yes then parse it using fast indexOf will work well in most/all such cases ? Cheers, Sergey
        Ka-Lok Fung made changes -
        Field Original Value New Value
        Attachment trunk.wc_quote_minimal.diff [ 12479250 ]
        Hide
        Ka-Lok Fung added a comment -

        Here's my first attempt at a patch for this issue against trunk.

        I'm more than willing to make further changes if anyone has any feedback.

        Three specific comments about this patch from my side:

        • To leverage the work done in CXF-2462, instead of rolling our own header parsing code in WebClient, HttpHeadersImpl is used to do the header parsing.
        • This patch could be made more efficient by adding a function to HttpHeadersImpl to return a Map<String, List<String>> (see comment in patch). If you don't object to the function addition, I can update the patch with this change.
        • I was also looking into adding a unit test but it seems there's no easy way for me to do so (because the header parsing done in WebClient is only done after a physical connection to a server). If anyone has any feedback on how to add a unit test for this, I'll greatly appreciate it.
        Show
        Ka-Lok Fung added a comment - Here's my first attempt at a patch for this issue against trunk. I'm more than willing to make further changes if anyone has any feedback. Three specific comments about this patch from my side: To leverage the work done in CXF-2462 , instead of rolling our own header parsing code in WebClient , HttpHeadersImpl is used to do the header parsing. This patch could be made more efficient by adding a function to HttpHeadersImpl to return a Map<String, List<String>> (see comment in patch). If you don't object to the function addition, I can update the patch with this change. I was also looking into adding a unit test but it seems there's no easy way for me to do so (because the header parsing done in WebClient is only done after a physical connection to a server). If anyone has any feedback on how to add a unit test for this, I'll greatly appreciate it.
        Ka-Lok Fung created issue -

          People

          • Assignee:
            Sergey Beryozkin
            Reporter:
            Ka-Lok Fung
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development