Tapestry 5 / TAP5-302

URL encoded strings that contain symbols such as %2f (encoded "/") are decoded incorrectly in some environments

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.0.16
    • Component/s: None
    • Labels:
      None

      Description

      If an activation context variable contains %2f (URL-encoded /), it is decoded to / and therefore treated as a separator between context variables.

      Example:

      http://www.example.com/test/one/t%2fwo

      I expect the above URL to contain three context variables, namely test, one and t/wo, but Tapestry thinks it contains four: test, one, t and wo.

      This makes it impossible to use Base64-encoded context variables, because they can contain the / symbol (which gets URL encoded to %2f).
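The reported symptom can be reproduced outside Tapestry. The following is a minimal sketch, not Tapestry's code: it assumes a container that URL-decodes the whole path before it is split into context values. The class and method names are hypothetical, and `URLDecoder.decode(String, Charset)` requires Java 10 or later.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of Tapestry.
class ContextSplitDemo {

    // Decode the whole path first, then split on "/".
    // An encoded "/" (%2f) becomes a real separator and is lost as data.
    static List<String> decodeThenSplit(String path) {
        String decoded = URLDecoder.decode(path, StandardCharsets.UTF_8);
        return new ArrayList<>(Arrays.asList(decoded.split("/")));
    }

    // Split on "/" first, then decode each segment.
    // An encoded "/" survives inside its segment.
    static List<String> splitThenDecode(String path) {
        List<String> values = new ArrayList<>();
        for (String segment : path.split("/")) {
            values.add(URLDecoder.decode(segment, StandardCharsets.UTF_8));
        }
        return values;
    }

    public static void main(String[] args) {
        String path = "test/one/t%2fwo";
        System.out.println(decodeThenSplit(path)); // four values
        System.out.println(splitThenDecode(path)); // three values
    }
}
```

With the path from the example URL, the first method yields test, one, t and wo, while the second yields test, one and t/wo.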

          Activity

          Martijn Brinkers added a comment -

          If I return a String containing / from onPassivate, the / is converted to %252F, which is the URL-encoded form of %2f. I don't think this is right, because other characters are not double encoded. For example, returning test@example.com from onPassivate results in test%40example.com.
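To make the %252F observation concrete, here is a small sketch using the JDK's `java.net.URLEncoder` (Java 10 or later). It only shows what double encoding looks like. The issue does not show the code path Tapestry uses, so the class and method names are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; illustrates double encoding with the JDK only.
class DoubleEncodingDemo {

    // One pass of standard form-style URL encoding.
    static String encodeOnce(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Two passes: the "%" produced by the first pass is itself encoded as "%25".
    static String encodeTwice(String value) {
        return encodeOnce(encodeOnce(value));
    }

    public static void main(String[] args) {
        System.out.println(encodeOnce("/"));                // %2F
        System.out.println(encodeTwice("/"));               // %252F
        System.out.println(encodeOnce("test@example.com")); // test%40example.com
    }
}
```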

          Martijn Brinkers added a comment -

          Looking at TapestryInternalUtils, it seems this is intended behavior. The problem, imho, is that this is a non-standard encoding, which makes it harder (and more error-prone) to interface non-Tapestry code with Tapestry: instead of using standard URL encoding, the caller must now use Tapestry's 'special' URL encoding.

          Fernando Padilla added a comment -

          I opened a bug on this behavior before, but it did not get any traction. (I reported it as an issue between Jetty and Tomcat, since they seem to behave differently.)

          I really recommend that Tapestry use a string encoding other than URL encoding for the activation context. It may be "non-standard", but it will guarantee that you can send any characters you want through the activation context without worrying that they will be trashed.

          So I support you continuing to put pressure to get this fixed in the general case!

          Fernando Padilla added a comment -

          At the moment, we pass a path ("blah/blah") in an activation context, and I have to do my own encoding/decoding so that the presence of a "/" won't corrupt the whole activation context.

          Martijn Brinkers added a comment -

          Another problem with context encoding is that + is not converted to a space (' '). Java's URLEncoder encodes a space as +, so I would expect Tapestry to convert + back to a space. Encoding a space as + is common enough that I think Tapestry should support it.

          Personally, I think Tapestry should comply with the normal way of encoding URLs and not use its own flavor.
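The + convention mentioned here is the behavior of the JDK's form-style encoder and decoder. A short sketch (Java 10 or later, hypothetical class and method names) shows the round trip:

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; shows the JDK's "+ means space" convention.
class PlusAsSpaceDemo {

    // URLEncoder writes a space as "+".
    static String formEncode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // URLDecoder turns "+" back into a space and also accepts "%20".
    static String formDecode(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(formEncode("hello world"));   // hello+world
        System.out.println(formDecode("hello+world"));   // hello world
        System.out.println(formDecode("hello%20world")); // hello world
    }
}
```

A decoder that handles only %XX sequences would leave the + in place, which is the mismatch described in the comment.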

          Howard M. Lewis Ship added a comment -

          I think the core problem is that Jetty and Tomcat do different things and we can't rely on correct behavior. I think we'll come up with our own, simple encoding system that will not be affected by whether Tomcat does or does not URL-decode it.

          Martin Grotzke added a comment -

          AFAICS from svn, "%" is now custom-encoded as "$00", so that "/" becomes "$002f" and " " becomes "$0020". If that's not true, please correct me.

          Howard M. Lewis Ship added a comment -

          Stand corrected:

          The $ is an escape code that can be followed by:

          N for null
          B for blank
          $ for a literal '$'
          four hex digits - any other unicode character
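The escape rules listed above can be sketched as a small encoder and decoder. This is an illustration under stated assumptions, not Tapestry's implementation: the set of characters passed through unescaped (`SAFE`) is assumed, "blank" is read as the empty string, "null" as a Java null reference, and the class and method names are hypothetical.

```java
// Hypothetical sketch of the "$" escape scheme described in this comment.
class DollarEscapeSketch {

    // Assumption: these characters are passed through unescaped.
    private static final String SAFE =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_.:";

    static String encode(String input) {
        if (input == null) return "$N";    // $N for null
        if (input.isEmpty()) return "$B";  // $B for blank (assumed: empty string)
        StringBuilder out = new StringBuilder();
        for (char ch : input.toCharArray()) {
            if (ch == '$') {
                out.append("$$");          // $$ for a literal '$'
            } else if (SAFE.indexOf(ch) >= 0) {
                out.append(ch);
            } else {
                // '$' followed by four hex digits for any other character
                out.append(String.format("$%04x", (int) ch));
            }
        }
        return out.toString();
    }

    static String decode(String input) {
        if (input.equals("$N")) return null;
        if (input.equals("$B")) return "";
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char ch = input.charAt(i);
            if (ch != '$') {
                out.append(ch);
                i++;
            } else if (i + 1 < input.length() && input.charAt(i + 1) == '$') {
                out.append('$');
                i += 2;
            } else if (i + 5 <= input.length()) {
                // Parse the four hex digits that follow the '$'.
                out.append((char) Integer.parseInt(input.substring(i + 1, i + 5), 16));
                i += 5;
            } else {
                throw new IllegalArgumentException("Truncated escape at index " + i);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("t/wo"));     // t$002fwo
        System.out.println(decode("t$002fwo")); // t/wo
    }
}
```

Because the escape character is $ rather than %, the encoded value contains nothing a servlet container would URL-decode, so a / inside a context value reaches the application as $002f regardless of how the container treats %2f.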


            People

            • Assignee:
              Howard M. Lewis Ship
              Reporter:
              Martijn Brinkers