Bug 34602

Summary: mod_rewrite fails to correctly deal with URLS that have escapes in them
Product: Apache httpd-2 Reporter: Michael Sinz <michael.sinz>
Component: mod_rewrite Assignee: Apache HTTPD Bugs Mailing List <bugs>
Status: RESOLVED WORKSFORME    
Severity: major CC: aren, asf-bugs, benoit-apache, brectanu, guenther.gsenger, jay.mccarthy, markus.stockhausen, mike, oldium.pro, web
Priority: P2    
Version: 2.5-HEAD   
Target Milestone: ---   
Hardware: All   
OS: All   
URL: http://svn.sinz.com/svn/example/trunk/tests/
Attachments: Test page that lets you try all of the 7-bit ASCII test cases
Escape internal redirects for 2.0.55
Adds escaping-functionality to backreferences

Description Michael Sinz 2005-04-25 17:11:56 UTC
I have a simple redirect rule that looks something like this:

  RewriteCond %{QUERY_STRING} Insurrection=rss
  RewriteRule ^svn/(.*)$ /rss.cgi/$1  [R,L]

Now, the URL given is to a file which happens to have various strange characters
in it, thus the URL is very escaped.  The URL looks like:

href="/svn/example/trunk/tests/CanThisWork&amp;amp%3bInSVN%3f/test?Insurrection=rss"

and the end result is that the %3f (which is a "?") and all after it is stripped
off.  If I have the page link directly to the target of the rewrite within the
HTML, it works.

Note that things are even worse if I try to use the [P,L] (proxy) rather than
just redirect [R,L] as then other escaped characters cause problems.

I have not yet put together a full set of test cases but the following does work
in [R,L] but not in [P,L]

href="/svn/example/trunk/tests/Really%21%7E%23$%25@%5e*%28%29&amp;%20Nasty/test?Insurrection=rss"

I will put some of the test cases on the public web site at
http://svn.sinz.com:8000/ using exactly these rewrite rules.
Comment 1 Michael Sinz 2005-04-25 18:26:43 UTC
Specifically, the example of:

http://svn.sinz.com:8000/svn/example/trunk/tests/TestCase-%3f-/test.txt?Insurrection=log

does not work while what should have been the rewritten URL does:

http://svn.sinz.com:8000/log.cgi/example/trunk/tests/TestCase-%3f-/test.txt?Insurrection=log

I have now put some specific test cases on the web site for public access at the
Subversion/Insurrection URL of http://svn.sinz.com:8000/svn/example/trunk/tests/
Comment 2 Michael Sinz 2005-04-28 01:48:58 UTC
I have now verified this with 2.0.54
Comment 3 Michael Sinz 2005-05-12 01:08:52 UTC
This is actually worse than I thought.

In the redirect case, all CGI parameters that have URL escaped characters are
munged into being double-escaped.

The "%3F" escape within the URL path part after the rewrite looks like it is no
longer escaped and now becomes the CGI introducer.

In the proxy case, the CGI parameters are fine but a number of escape codes in
the URL path part now get confused.

See the tests at http://svn.sinz.com:8000/rewrite-test/index.html

This is against Apache 2.0.54
Comment 4 Michael Sinz 2005-05-12 01:12:37 UTC
Created attachment 15000 [details]
Test page that lets you try all of the 7-bit ASCII test cases

This is the source to the test case that I have on
http://svn.sinz.com:8000/rewrite-test/index.html

The test code is also available from my Subversion server at
http://svn.sinz.com:8000/svn/Web/rewrite-test/
Comment 5 Brian 2005-05-30 21:54:20 UTC
If I remember right, URLs that are rewritten will be escaped by default. Maybe
there is still a problem above and beyond this but I didn't see you mention
tests using the "NE" option, per the apache 2 docs this is defined as:

 'noescape|NE' (no URI escaping of output)
This flag keeps mod_rewrite from applying the usual URI escaping rules to the
result of a rewrite. Ordinarily, special characters (such as '%', '$', ';', and
so on) will be escaped into their hexcode equivalents ('%25', '%24', and '%3B',
respectively); this flag prevents this from being done. This allows percent
symbols to appear in the output, as in

RewriteRule /foo/(.*) /bar?arg=P1\%3d$1 [R,NE]
which would turn '/foo/zed' into a safe request for '/bar?arg=P1=zed'.
Comment 6 Michael Sinz 2005-05-30 22:19:31 UTC
(In reply to comment #5)
> If I remember right, URLs that are rewritten will be escaped by default. Maybe
> there is still a problem above and beyond this but I didn't see you mention
> tests using the "NE" option, per the apache 2 docs this is defined as:

I know what the NE option does, but if you look at the rewrite rules, none of
the rules actually use special characters.  However, the URL has special
characters in it.

The examples at http://svn.sinz.com/rewrite-test/index.html show this as the URL
passed into the rewrite engine has the special characters and yet the data
within the CGIs shows that something bad has happened.  The test page referenced
above shows three frames, one each of not-rewritten, R-rewritten, and
P-rewritten requests and what the CGI/Environment says is going on.

#######################################################################
RewriteRule	^redirect/(.*)$	"/rewrite-test/test.cgi/$1"	[R,L]
RewriteRule	^proxy/(.*)$	"/rewrite-test/test.cgi/$1"	[P,L]
#######################################################################

Given that all I do is take part of the URL and change it, this should
be the correct way of handling it.
Comment 7 Marcus Bointon 2005-11-03 09:50:25 UTC
(In reply to comment #6)
> RewriteRule	^redirect/(.*)$	"/rewrite-test/test.cgi/$1"	[R,L]
> 
> Given that all I do is take part of the URL and change it, this should
> be the correct way of handling it.

I'm seeing exactly the same thing, for the same purpose, and NE makes no difference. I have a 
redirect that accepts a URL embedded within the URL and passes it to a PHP script, which then redirects to 
the embedded URL. I'd describe the problem differently: it's as if the matched subpattern has 
urldecode applied to it before it is passed to the output pattern.

I'm trying to avoid this being a me-too report, so here is a thoroughly unpleasant workaround: if you 
double-urlencode the incoming parameter, you end up with the string you were expecting in the output 
pattern. Here's what I want to be passing in:

http://www.example.com/u/http%3A%2F%2Fwww.apache.org%2F

This rule handles it:

RewriteRule ^u/(.*) redirect.php?url=$1 [R,L]

mod_rewrite generates an invalid URL:

redirect.php?url=http://www.apache.org/

If I double-urlencode the embedded URL:

http://www.example.com/u/http%253A%252F%252Fwww.apache.org%252F

I get:

redirect.php?url=http%3A%2F%2Fwww.apache.org%2F

which works, but undermines much of the point of having nice tidy mod_rewrite URLs.

Is this bug really this straightforward? I can't think of a circumstance where you'd want this behaviour 
and it's so simple that I can't believe that this has not been encountered before. I'm running 2.0.54 on 
MacOS X and 2.0.52 on RHEL 4 and both are acting this way.
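
The behaviour described in this comment can be modelled in a few lines of Python. This is only a sketch, not mod_rewrite's code: `rewrite` is a hypothetical stand-in for the rule `^u/(.*)` with target `redirect.php?url=$1`, and it assumes the path is percent-decoded once before the pattern is applied and that the backreference is then copied into the target without re-encoding.

```python
import re
from urllib.parse import unquote

def rewrite(raw_path):
    """Hypothetical model of: RewriteRule ^u/(.*) redirect.php?url=$1"""
    # Assumption: the path is decoded once before the pattern sees it.
    decoded = unquote(raw_path.lstrip("/"))
    match = re.match(r"^u/(.*)", decoded)
    if match is None:
        return None
    # The backreference is copied verbatim, with no re-encoding.
    return "redirect.php?url=" + match.group(1)

single = rewrite("/u/http%3A%2F%2Fwww.apache.org%2F")
double = rewrite("/u/http%253A%252F%252Fwww.apache.org%252F")
print(single)
print(double)
```

Under those assumptions the single-encoded request yields the invalid target reported above, and the double-encoded request yields the target that was wanted in the first place.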
Comment 8 Marcus Bointon 2005-11-05 10:37:04 UTC
*** Bug 36986 has been marked as a duplicate of this bug. ***
Comment 9 Michael Sinz 2005-11-18 15:47:04 UTC
(In reply to comment #7)
> (In reply to comment #6)
> > RewriteRule	^redirect/(.*)$	"/rewrite-test/test.cgi/$1"	[R,L]
> > 
> > Given that all I do is take part of the URL and change it, this should
> > be the correct way of handling it.
> 
[...]
> I'm trying to avoid this being a me-too report, so here is a thoroughly
> unpleasant workaround: if you double-urlencode the incoming parameter,
> you end up with the string you were expecting in the output 
> pattern. Here's what I want to be passing in:
[...]

It is unacceptable to try to double-URL encode the input as the input is
the valid public URL that a web browser may request.  So how do I cause
the input to be double encoded?  And how do I deal with the strange
side-effects of that in the CGI scripts?  (And what of when the data
itself has to be URL encoded but the script is called directly?)

Anyway, to me this seems like a significant problem.  My workaround is
rather nasty (see the Insurrection project) and requires some tricks
in both the rewrite rule and the way the CGI parameters are processed
(and thus reprocessed after applying the fixup code)

A very ugly workaround that does not really cover the general case.
Comment 10 Marcus Bointon 2005-11-18 16:37:14 UTC
I did say that it was unpleasant. It's obviously not a general workaround for public URLs, but if you are in 
control of both generating and receiving the URLs (as I am in my current context), it's quite workable. 
Since it's obviously broken, there's not a lot else we can do until it's fixed, at which point I'll remove my 
double encoding. Ugly but workable beats just plain broken every time.

There is no problem with CGIs - they should expect their parameters to be URL encoded - I know PHP 
automatically decodes all parameters, however, bear in mind that that means that it may corrupt input 
data because it doesn't know that mod_rewrite has already done a decoding pass. If this double decoding 
has not affected you (e.g. because your input strings don't contain %), then you're just lucky.
Comment 11 Oldrich Jedlicka 2006-02-02 18:29:41 UTC
Created attachment 17573 [details]
Escape internal redirects for 2.0.55

This patch escapes internal redirect requests (just before the message
"internal redirect with..."). This is logical as it seems that the redirection
is fully processed again (unescaped and so on).
Comment 12 Oldrich Jedlicka 2006-02-02 18:31:05 UTC
Example of successful rewrite for URL http://karel.oldium.home/%/ into  
http://karel.oldium.net/%/  
   
Rule:   
   
RewriteCond %{HTTP_HOST} ^karel.oldium.home$   
RewriteRule (.*) http://karel.oldium.net/$1 [last]   
   
Log:   
   
...(4) RewriteCond: input='karel.oldium.home' pattern='^karel.oldium.home$' =>   
matched   
...(2) [per-dir /var/www/oldium.home/www/] rewrite %/ ->   
http://karel.oldium.net/%/   
...(2) [per-dir /var/www/oldium.home/www/] implicitly forcing redirect (rc=302)   
with http://karel.oldium.net/%/   
...(2) [per-dir /var/www/oldium.home/www/] trying to replace   
prefix /var/www/oldium.home/www/ with /   
...(1) [per-dir /var/www/oldium.home/www/] escaping http://karel.oldium.net/%/   
for redirect   
...(1) [per-dir /var/www/oldium.home/www/] redirect to   
http://karel.oldium.net/%25/ [REDIRECT/302]   
   
This is a correct redirect. The response from Apache to my browser is:   
   
HTTP/1.1 302 Found   
Date: Thu, 02 Feb 2006 16:51:23 GMT   
Server: Apache   
Location: http://karel.oldium.net/%25/   
...   
<p>The document has moved <a href="http://karel.oldium.net/%25/">here</a>.</p>   
...   
Comment 13 Oldrich Jedlicka 2006-02-02 18:34:10 UTC
Example of local rewrite for URL http://karel.oldium.home/%/ to local /karel/%/ 
 
Rule: 
 
RewriteCond %{HTTP_HOST} ^karel.oldium.home$ 
RewriteRule (.*) /karel/$1 [last] 
 
Log: 
 
...(4) RewriteCond: input='karel.oldium.home' pattern='^karel.oldium.home$' => 
matched 
...(2) [per-dir /var/www/oldium.home/www/] rewrite %/ -> /karel/%/ 
...(2) [per-dir /var/www/oldium.home/www/] trying to replace 
prefix /var/www/oldium.home/www/ with / 
...(1) [per-dir /var/www/oldium.home/www/] escaping /karel/%/ for redirect 
...(1) [per-dir /var/www/oldium.home/www/] internal redirect with /karel/%25/ 
[INTERNAL REDIRECT] 
.../redir#1] (3) [per-dir /var/www/oldium.home/www/] add path info 
postfix: /var/www/oldium.home/www/karel/% -> /var/www/oldium.home/www/karel/%/ 
.../redir#1] (3) [per-dir /var/www/oldium.home/www/] strip per-dir 
prefix: /var/www/oldium.home/www/karel/%/ -> karel/%/ 
... continues ... 
 
The percent sign is handled correctly. 
Comment 14 Mike Weller 2006-06-08 13:57:12 UTC
Hi there. Just spent an hour or so looking at the mod_rewrite source.
Unfortunately it looks like apache passes the module the path/filename part of
the url as already unescaped.

There is a workaround to reverse the unescaping, but you still can't use '/'
(%2F) because it is already decoded by the time mod_rewrite gets it, and there's
no way to know whether it was escaped or not in the original url.

I hacked together a messy fixurl(...) function to re-encode '=', '&', '#' etc.,
then applied that to the uri variable in function 

int apply_rewrite_rule(...)

Just before

    rc = (ap_regexec(regexp, uri, AP_MAX_REG_MATCH, regmatch, 0) == 0);
    if (! (( rc && !(p->flags & RULEFLAG_NOTMATCH)) ||
           (!rc &&  (p->flags & RULEFLAG_NOTMATCH))   ) ) {
        return 0;
    }

Like I say my code is a hack, so I'll leave it up to someone else to provide a
better fix/patch.
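
To illustrate why re-encoding after the fact cannot recover an escaped slash, here is a small Python sketch. It is not the fixurl(...) hack mentioned above (that code was never posted); the `fixurl` below is a hypothetical re-encoder that escapes everything except unreserved characters and '/'.

```python
from urllib.parse import quote, unquote

def fixurl(decoded_path):
    """Hypothetical re-encoder: escape '=', '&', '#', '?' and so on, but keep '/'."""
    return quote(decoded_path, safe="/")

original = "/a%3Db%26c/x%2Fy"
decoded = unquote(original)   # what the module is assumed to receive
restored = fixurl(decoded)
print(decoded)
print(restored)
```

In this sketch '=' and '&' come back as escapes, but the '%2F' in the original is indistinguishable from a real path separator once decoded, so it stays a plain '/'.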
Comment 15 Jordan Mendelson 2006-08-12 03:36:43 UTC
I'm showing the same problem with 2.0.55. The following fails:

RewriteRule ^/a/(.+)$       http://www.example.com/b/$1 [R,L]

This causes the URL:

http://localhost/a/where%3f/get?id=1

To be mapped to:

http://www.example.com/b/where?/get

instead of:

http://www.example.com/b/where%3f/get?id=1

Comment 16 Francis Daly 2006-10-11 03:39:20 UTC
(In reply to comment #15)

> RewriteRule ^/a/(.+)$       http://www.example.com/b/$1 [R,L]
> 
> This causes the URL:
> 
> http://localhost/a/where%3f/get?id=1
> 
> To be mapped to:
> 
> http://www.example.com/b/where?/get
> 
> instead of:
> 
> http://www.example.com/b/where%3f/get?id=1

RewriteMap esc int:escape

RewriteRule ^/a/(.+)$       http://www.example.com/b/${esc:$1} [R,L,NE]

Given URL?QS, core unescapes URL but not QS, and rewrite escapes both the
URL and QS that it gets. So [NE] prevents rewrite doing the escaping, and
the RewriteMap causes it to escape the URL it gets, but not the QS it gets.

I'm not sure it's *right*, but it seems to work for me, up to and including
2.0.59.
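
A rough Python model of the workaround in this comment, under stated assumptions: the core decodes the path but leaves the query string alone, [NE] means nothing is escaped afterwards, and `quote` is only an approximation of what `int:escape` does (for one thing, it emits upper-case hex digits). The function name is hypothetical.

```python
import re
from urllib.parse import quote, unquote

def redirect_with_escape_map(raw_url):
    """Hypothetical model of: RewriteRule ^/a/(.+)$ http://www.example.com/b/${esc:$1} [R,L,NE]"""
    raw_path, _, raw_query = raw_url.partition("?")
    decoded_path = unquote(raw_path)            # assumed: only the path is decoded
    match = re.match(r"^/a/(.+)$", decoded_path)
    if match is None:
        return None
    # ${esc:$1}: re-escape the captured path part only.
    target = "http://www.example.com/b/" + quote(match.group(1), safe="/")
    # [NE]: the untouched query string is appended as received.
    return target + ("?" + raw_query if raw_query else "")

result = redirect_with_escape_map("/a/where%3f/get?id=1")
print(result)
```

The result matches the URL comment #15 expected, apart from the case of the hex digits in the escape.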
Comment 17 Bob Ionescu 2007-01-25 19:24:07 UTC
(In reply to comment #16)
> So [NE] prevents rewrite doing the escaping, and
> the RewriteMap causes it to escape the URL it gets, but not the QS it gets.
> 
> I'm not sure it's *right*

Yes, I think so, but this is another problem not directly related to the problem
described here (rewriting rule-pattern to query string). The current behavior is
imho wrong. If we force a redirect, the query string should remain untouched
from any escaping intended for uri-paths, because this modifies the query string
in an unexpected way. A qs like 'foo%26bar' (location header /foo?foo%2526bar)
results in foo%2526bar, which isn't equal to the original query string any more,
while a uri like /foo%bar (location header: /foo%25bar) results in /foo%bar. But
some specific characters within the query string must be escaped, though (such
as spaces).
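
The asymmetry described here can be sketched in Python. The model is an assumption, not mod_rewrite's implementation: the path is decoded on the way in and escaped on the way out, while the query string is never decoded but still goes through the same escaping pass. `escape_for_redirect` is a hypothetical helper.

```python
from urllib.parse import quote, unquote

def escape_for_redirect(path, query):
    """Hypothetical model: one escaping pass applied to both path and query."""
    return quote(path, safe="/") + "?" + quote(query, safe="=&")

request_path, request_query = "/foo%25bar", "foo%26bar"
decoded_path = unquote(request_path)   # the path is decoded before rewriting
location = escape_for_redirect(decoded_path, request_query)  # the query is not
print(location)
```

The path round-trips to its original form, while the query string comes out with one more level of escaping than it went in with.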

(In reply to comment #5)
>  'noescape|NE' (no URI escaping of output)

Yes, the NE flag prevents that, but the uri-path may be invalid now since this
prevents the URL-path being escaped, too.


(In reply to comment #11)
> Escape internal redirects for 2.0.55
> 
> This patch escapes internal redirect requests (just before the message
> "internal redirect with..."). This is logical as it seems that the redirection
> is fully processed again (unescaped and so on).

Yes, this would be logical. The main difference is that this doesn't touch the
query string.
Comment 18 Michael 2007-03-11 17:02:17 UTC
This bug is a killer for me using PHP and its URLENCODE function.
Basically this encodes a space as a literal '+' in the URL and escapes a literal
'+' as %2b. The problem is that once we hit the RewriteRule, the space is still
encoded as a literal '+' and the literal '%2b' is decoded to be a literal '+'
as well.
As you can imagine, the RewriteMap solution doesn't work and I'm left with no
solution but to double encode, which is horrible.
Is there a reason that one must decode the hex entities before the use of the
RewriteRules? Is it due to the 'being a path' way of thinking? A lot more
URLs are used not only as a path to a resource but to pass information as well.
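
The ambiguity can be reproduced outside Apache. In this Python sketch `quote_plus` plays the role of PHP's urlencode, and `unquote` stands in for the path decoding that is assumed to happen before the RewriteRule is applied (it decodes %XX escapes but leaves '+' alone).

```python
from urllib.parse import quote_plus, unquote

encoded_space = quote_plus("c stuff")   # space encoded as '+'
encoded_plus = quote_plus("c+stuff")    # literal '+' encoded as %2B

# What the rule pattern is assumed to see after path decoding:
seen_for_space = unquote(encoded_space)
seen_for_plus = unquote(encoded_plus)
print(encoded_space, "->", seen_for_space)
print(encoded_plus, "->", seen_for_plus)
```

Both inputs arrive at the rule as the same string, so nothing downstream can tell an encoded space from an encoded plus sign.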

This is what I'd like to see:

# accept a-zA-Z and %2b(escaped '+')
RewriteRule ^resource/([a-z]|%2b)+$ /resource.ext?data=$1 [NC]

This would still fail on say '/resource/info%' as it's not the sequence %2b etc
and would use the first matching rule for something like:

RewriteRule ^resource/([a-z]|%2|%2b)+$ /resource.ext?data=$1 [NC]
'/resource/%2'.

I'd love to hear everyone's opinion on this as I'm not sure if it would be the
correct way to handle it or if it would lead to security concerns etc.
If there is agreement I'll have a stab at implementing it and see where it
leads; if it is fundamentally wrong and you have some resources, I would love to
know that too.
Thanks 
Michael
Comment 19 Bob Ionescu 2007-03-16 14:29:35 UTC
(In reply to comment #18)
> Basically this encodes a space as a literal '+' in the url and escapes a literal
> '+' as %2b, the problem is that once we hit the RewriteRule the space is still
> encoded as a literal '+' and the literal '%2b' is decoded to be a literal '+'
> as well.

Rewriting URI-paths into the QueryString isn't safe - both have different rules
for encoding. This creates problems, if you mix both together.

But you can process $_SERVER['REQUEST_URI'] within php or catch the variables
from the unparsed ENV THE_REQUEST with a RewriteCond.
Comment 20 Ulf M 2007-03-21 10:04:33 UTC
I think this bug describes the same problem as 23295.
Comment 21 Rich Bowen 2007-03-21 10:16:15 UTC
You need to use the [NE] (NoEscape) flag in order to disable this escaping behavior.
Comment 22 Rich Bowen 2007-03-21 10:25:01 UTC
I'm sorry. I didn't read the entire history of the bug. Reopening on the chance
that someone else knows more about this than I do.
Comment 23 Michael Sinz 2007-03-21 10:56:16 UTC
From reading bug 23295, I would say that it is related but not the same problem.
In this case, the problem is that too much escaping is done (as in escaping
characters in the query string), while in the bug 23295 case, not enough
escaping was done (as in the URI part).  (Or, if you use NE, part of the
URI is escaped but part is not, and we once again get a failure.)
Comment 24 Mike Weller 2007-03-21 11:06:15 UTC
First: wow, I didn't know this bug was still open...

(In reply to comment #19)
> But you can process $_SERVER['REQUEST_URI'] within php or catch the variables
> from the unparsed ENV THE_REQUEST with a RewriteCond.

Yeah, this is the method I went with in the end... and Mediawiki does the same
thing for those that are curious.

As for the correct behaviour... well, as has been mentioned, the rules for
escaping are slightly different between the path and query part of the url.

say you have

RewriteRule ^(.*)$ pages/test.php?s=$1 [L]

If you use the url

/1&a=1

or

/1%26a%3d1

 the resulting internal url ends up as pages/test.php?s=1&a=1, in other words
mod_rewrite is parsing the path part after it has been decoded, and then doing a
direct copy into the query, without re-encoding it.

So the question is, should mod_rewrite parse the urls before or after url
decoding (maybe apache decodes before the url is passed to mod_rewrite?), and
should it re-encode data when it is copied to the query, or leave that up to the
script?
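
Seen from the script's side, the effect looks like this. The Python sketch below is a model only: `internal_rewrite` is a hypothetical stand-in for the rule above, assuming the path is decoded before matching and copied into the query string as-is, and `parse_qs` plays the role of the script's parameter parsing.

```python
import re
from urllib.parse import parse_qs, unquote

def internal_rewrite(raw_path):
    """Hypothetical model of: RewriteRule ^(.*)$ pages/test.php?s=$1 [L]"""
    decoded = unquote(raw_path.lstrip("/"))   # assumed: decoded before matching
    match = re.match(r"^(.*)$", decoded)
    return "pages/test.php?s=" + match.group(1)

results = {}
for raw in ("/1&a=1", "/1%26a%3d1"):
    target = internal_rewrite(raw)
    query = target.partition("?")[2]
    results[raw] = (target, parse_qs(query))
    print(raw, "->", target, results[raw][1])
```

In both cases the script receives two parameters, `s` and `a`, even though the second request carried the '&' and '=' in escaped form.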
Comment 25 Bob Ionescu 2007-03-21 13:22:44 UTC
(In reply to comment #24)
> So the question is, should mod_rewrite parse the urls before or after url
> decoding (maybe apache decodes before the url is passed to mod_rewrite?),

You might want to read
http://issues.apache.org/bugzilla/show_bug.cgi?id=32328#c12 where I tried to
explain how mod_rewrite's processing within the directory context works.
Comment 26 Michael 2007-03-22 02:03:26 UTC
Thanks for all the input, now knowing how to get around this and what is the
likely reason has helped me out of my deranged hysteria for another day.
I'd like to ask, though: does anyone have a pointer to some information as to why
this ambiguous behaviour is implemented, e.g. what security concerns there are for
paths? This all has me wondering about the validity of using hex-coded
entities in a SEF-style URL (are there other uses that require said query
string/path mangling?).
Keep up the great work!
Michael 
Comment 27 Guenther Gsenger 2007-05-18 10:46:16 UTC
Created attachment 20217 [details]
Adds escaping-functionality to backreferences

This patch adds a new flag to RewriteRule statement:
Adding the flag [B] (or [backrefescaping]) forces mod_rewrite to escape
backreferences in the rewrite target.
E.g. 
RewriteRule ^(.*)$   index.php?show=$1	[B,L]
In the given example, a request to http://example.com/C++ (or
http://example.com/C%2B%2B) would be redirected internally to
index.php?show=C%2B%2B instead of index.php?show=C++
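
What escaping the backreference buys can be sketched in Python. This is a model under assumptions, not the patch itself: `rewrite_show` is hypothetical, `quote(..., safe="")` is a rough stand-in for the escaping the proposed [B] flag performs, and `parse_qs` stands in for the script's query parsing.

```python
from urllib.parse import parse_qs, quote, unquote

def rewrite_show(raw_path, escape_backrefs):
    """Hypothetical model of: RewriteRule ^(.*)$ index.php?show=$1 with or without [B]"""
    captured = unquote(raw_path.lstrip("/"))   # assumed: pattern sees the decoded path
    if escape_backrefs:
        captured = quote(captured, safe="")    # rough stand-in for the [B] behaviour
    return "index.php?show=" + captured

without_b = rewrite_show("/C%2B%2B", escape_backrefs=False)
with_b = rewrite_show("/C%2B%2B", escape_backrefs=True)
seen_without_b = parse_qs(without_b.partition("?")[2])
seen_with_b = parse_qs(with_b.partition("?")[2])
print(without_b, seen_without_b)
print(with_b, seen_with_b)
```

Without the escaping step the script's parser turns each '+' into a space; with it, the script gets the literal `C++` back.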
Comment 28 Michael 2007-05-18 19:37:10 UTC
Just to put my workaround for my PHP problem with ambiguous escaping of '+'
signs in a rewrite rule here so someone might find it useful, thanks to Bob
Ionescu and Mike Weller for their leads.

// in the .htaccess file or vhost
// accept letters, plus signs and encoded plus signs
RewriteCond %{THE_REQUEST} /test/(([a-z]|%2b|\+|)+)*/? [NC]
RewriteRule . test.php?cat=%1 [NE,L] 

// php code for test.php
<?
print_r($_GET);
?>

// URL with encoded spaces which are +'s
www.domain.com/test/c++stuff
// gives
array( [cat] => c stuff) // multiple plus signs are decoded to 1 space by PHP

// URL with encoded +'s
www.domain.com/test/c%2b%2bstuff
// gives
array( [cat] => c++stuff ) // correctly decodes an encoded +

Hope this helps someone.
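
The idea behind matching on THE_REQUEST is that the request line still carries the escapes exactly as the client sent them. A minimal Python sketch of that idea follows; the pattern is a simplified version of the RewriteCond above rather than a copy of it, `cat_from_request_line` is a hypothetical helper, and `unquote_plus` stands in for the decoding PHP applies to query parameters.

```python
import re
from urllib.parse import unquote_plus

# Simplified pattern: letters, raw '+' and the escape %2b, matched case-insensitively.
PATTERN = re.compile(r"/test/((?:[a-z]|%2b|\+)+)/?", re.IGNORECASE)

def cat_from_request_line(request_line):
    """Capture the raw (still encoded) segment, then decode it once."""
    match = PATTERN.search(request_line)
    if match is None:
        return None
    return unquote_plus(match.group(1))

spaces = cat_from_request_line("GET /test/c++stuff HTTP/1.1")
pluses = cat_from_request_line("GET /test/c%2b%2bstuff HTTP/1.1")
print(repr(spaces))
print(repr(pluses))
```

In this model each raw '+' decodes to one space, so the first value holds two consecutive spaces, while the escaped form decodes to literal plus signs. Because the capture is taken before any decoding, the two requests stay distinguishable.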
Comment 29 ceetee+issues.apache.org 2007-08-23 16:38:34 UTC
Some of the comments here seem to suggest that all this is the expected
behavior. Well, I, for one, don't get it. Allow me to elaborate on my experience
with this bug.

On my site, I direct searches through /search/ followed by the search query, and
then another trailing slash. The rule I use is

  RewriteRule ^search(/(.+))?/$ /index.php?page=search&query=$2 [L]

This works fine for queries that don't contain a slash. If I were to search for
"9/10", for example, the requested path would become

  /search/9%2F10/

Now, from the discussion here, I gather that that won't work. And indeed, double
encoding fixes it. I hate that solution, personally, but, more importantly:

1. The error I get is a 404. First of all, the content of that error page has
the %2F decoded to /, which I don't fully get. But the really weird thing is
that my ErrorDocument 404 applies in all cases except this one--I get a standard
black on white 404 page for some reason.

2. Since escaping is the problem, allegedly, it should work if I omit the query
from the eventual URL, right? However, even the trivial

  RewriteRule ^search /index.php [L]

fails to match /search/9%2F10/.

I realize this is not a help forum, but I would sure appreciate some input. I'm
sorry if this is a different bug, but it seemed related.
Comment 30 Joshua Slive 2007-08-24 07:54:54 UTC
Getting 404 for %2F indicates that you need to look at the AllowEncodedSlashes
directive. (I have nothing to say about other issues reported here.)
Comment 31 cuong 2007-09-08 03:10:47 UTC
I've spent the whole day debugging mod_rewrite and finally found this bug.
Has it been fixed and incorporated into the latest release? It's quite annoying
to have such a bug in the most widely used HTTP server on the net.
Comment 32 Nick Kew 2007-09-08 04:37:10 UTC
*** Bug 39746 has been marked as a duplicate of this bug. ***
Comment 33 Nick Kew 2007-09-08 05:47:59 UTC
Patch from comment #27 committed to /trunk/ in r573831
Comment 34 Mike Weller 2007-09-08 12:05:46 UTC
(In reply to comment #33)
> Patch from comment #27 committed to /trunk/ in r573831

It's taken over 2 years for this to be resolved. The power of open source, eh?
Comment 35 Nick Kew 2007-09-08 12:19:24 UTC
(In reply to comment #34)
> (In reply to comment #33)
> > Patch from comment #27 committed to /trunk/ in r573831
> 
> It's taken over 2 years for this to be resolved. The power of open source, eh?

The patch has been around for longer.

Your option to fix it yourself, or pay someone to fix it, or work around it, has
always been around.
Comment 36 Nick Kew 2007-09-10 05:06:38 UTC
*** Bug 23295 has been marked as a duplicate of this bug. ***
Comment 37 Jonathan Rochkind 2007-10-17 12:57:24 UTC
I'm not sure if people are successfully convincing other people this IS a bug.
Let's take another example:

Incoming URL:
/foo?bar=%3Abaz

This is a perfectly legal URL, right? The "%3A" is a perfectly legally encoded
":" char---that is the way it OUGHT to be included. 

Now let's say I want to redirect all /foo urls to an external server:

RewriteRule /foo http://somewhere.else.com/other [R]

Expected behavior, redirect to:
http://somewhere.else.com/other?bar=%3Abaz

Yes?

ACTUAL behavior, redirect to:
http://somewhere.else.com/other?bar=%253Abaz

Some of you are arguing that this is intended behavior? How can this possibly
be? I got a perfectly legal URL in with a perfectly legal query string. My
RewriteRule should be expected to leave the query string exactly intact, right?
Yet it corrupts it to mean something else. 

To me, this is obviously a bug. [And one that's causing me a serious problem at
the moment to boot]. 
Comment 38 Michael Sinz 2007-10-17 19:07:13 UTC
I know that I don't think this is correct behavior and that some of the
discussion here seems to have missed the point of the rewrite problem that I
initially reported.
Comment 39 Cory Rustoferson 2007-10-18 09:02:45 UTC
(In reply to comment #35)
> (In reply to comment #34)
> > (In reply to comment #33)
> > > Patch from comment #27 committed to /trunk/ in r573831
> > 
> > It's taken over 2 years for this to be resolved. The power of open source, eh?
> 
> The patch has been around for longer.
> 
> Your option to fix it yourself, or pay someone to fix it, or work around it, has
> always been around.

As Jonathan Rochkind stated above:

Incoming URL:
/foo?bar=%3Abaz

Expected behavior, redirect to:
http://somewhere.else.com/other?bar=%3Abaz

ACTUAL behavior, redirect to:
http://somewhere.else.com/other?bar=%253Abaz

I think mod_rewrite should not reencode unless I tell it to, or at the very 
least, let me tell it not to.

If a patch exists and has not been released then what kind of money are we 
talking here to get it fixed in a major release?  I don't have a job but I'd be 
willing to put a few dollars towards getting this fixed.
Comment 40 Marcus Bointon 2007-10-22 01:48:19 UTC
(In reply to comment #39)
> I think mod_rewrite should not reencode unless I tell it to, or at the very 
> least, let me tell it not to.

I quite agree. Those that say that it's correct should be campaigning for a documentation change saying "it's not 
possible to pass URL-unsafe parameters (i.e. those that require urlencoding) through mod_rewrite". I suspect that the 
vast majority of tutorials, documentation and articles about mod_rewrite are broken by this bug - the only reason they 
work as they are is pure luck and simplistic examples.
 
> If a patch exists and has not been released then what kind of money are we 
> talking here to get it fixed in a major release?  I don't have a job but I'd be 
> willing to put a few dollars towards getting this fixed.

Me too. Without a patch the choices are : don't use apache, don't use mod_rewrite or (shiver) double encode 
everything. I have another workaround that's workable at the moment - instead of urlencoding params, I base64-
encode them. Really ugly, but it works.
Comment 41 Marcus Bointon 2007-10-22 01:51:25 UTC
(In reply to comment #18)
> This bug is a killer for me using PHP and its URLENCODE function.
> Basically this encodes a space as a literal '+' in the url and escapes a literal
> '+' as %2b

There's an easy workaround for that - use rawurlencode() instead which encodes spaces as %20 instead of +.
It will still suffer from this bug if the string contains any params that get rewritten.
Comment 42 Nick Kew 2007-10-22 03:07:57 UTC
Folks - we know all about this bug, and it still needs someone to find time to
tidy up the patch.  See http://marc.info/?t=118925575100001&r=1&w=2 for why the
existing patch isn't considered quite good enough.
Comment 43 Jonathan Rochkind 2007-10-22 11:48:06 UTC
Awesome, thanks. Reassuring. 
Comment 44 Nick Kew 2007-10-29 06:20:16 UTC
Fixed in 0
Comment 45 Nick Kew 2007-11-11 16:20:38 UTC
*** Bug 42610 has been marked as a duplicate of this bug. ***
Comment 46 Mårten Berglund 2008-08-08 04:49:42 UTC
This bug doesn't seem to be fixed after all. See

https://issues.apache.org/bugzilla/show_bug.cgi?id=45529
Comment 47 Mårten Berglund 2008-08-20 13:49:32 UTC
Log now confirms the bug is still there - see again bug 45529
Comment 48 alada 2010-10-09 18:08:11 UTC
Here is the application of above php solution, if your web host still has an Apache with this bug.
(The bug where an external htaccess redirect double encodes url parameters)

in .htaccess:
RewriteCond ...
RewriteRule .... /phplist_redirect10.php [L]

in the file phplist_redirect10.php:

<?php
header(
'Location: http://'
. $_SERVER['SERVER_NAME']
. preg_replace
    (
      '/^\/myfolder([\-_a-zA-Z0-9]+)\/(.*)$/'
     ,'/myfolder/$1/$2'
     ,$_SERVER['REQUEST_URI']
     ) 
,TRUE
,301  //301 for permanent redirect, 303 for temporary redirect
);


$_SERVER['REQUEST_URI']: the url as written in browser bar, contains first slash. like /myfolder/myfolder/some.php?a=5#bcd
Comment 49 Aren Cambre 2011-01-20 10:17:34 UTC
Per last few comments, this is still a problem.

Comment 39 puts it well:

> I think mod_rewrite should not reencode unless I tell it
> to, or at the very least, let me tell it not to.

Until the "let me tell it not to" is implemented, this needs to stay open.

I'm running across this double-encoding problem on a proxied Perl app.
Comment 50 Eric Covener 2011-01-20 10:50:09 UTC
[NE] is required to not escape the substitution, whether you include characters that need escaping in-line or via a backreference. 

Additionally, query strings that aren't modified are no longer escaped in 2.3 [this is one of the followup bug reports that should have been a separate bug]

Please open separate bugs for separate rewrite issues if you'd like them reconsidered.  I'd suggest even if you want to re-open this bug, you instead open a new bug with less baggage.
Comment 51 Eric Covener 2011-01-20 10:56:55 UTC
changing disposition to WORKSFORSOME, bug too muddled for a proper closing code.