Bug 50267 - mod_mbox: % in message id, needs escaping as %25
Status: RESOLVED FIXED
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_mbox
Version: 2.5-HEAD
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
Reported: 2010-11-13 15:34 UTC by jeremy carroll
Modified: 2013-05-21 19:51 UTC

Description jeremy carroll 2010-11-13 15:34:53 UTC
I was trying to understand why, in this archive:
http://mail-archives.apache.org/mod_mbox/incubator-general/201011.mbox/browser

messages from Mattmann, Chris A (388J), such as the second one, do not load.

Using Firefox, if I open the link in a new tab, I get:
http://mail-archives.apache.org/mod_mbox/incubator-general/201011.mbox/ajax/%3CC8F432CA.238DB%Chris.A.Mattmann@jpl.nasa.gov%3E

which has the obvious percent-escape problem: the % in the message id
has not been escaped.

With the % escaped as %25, the URL

http://mail-archives.apache.org/mod_mbox/incubator-general/201011.mbox/ajax/%3CC8F432CA.238DB%25Chris.A.Mattmann@jpl.nasa.gov%3E

returns the appropriate XML content.

cf.
http://mail-archives.apache.org/mod_mbox/incubator-general/201011.mbox/%3CC8F432CA.238DB%25Chris.A.Mattmann@jpl.nasa.gov%3E


The headers say:
Server: Apache/2.3.8 (Unix) mod_ssl/2.3.8 OpenSSL/1.0.0a
Comment 1 jeremy carroll 2010-11-13 15:48:24 UTC
It seems more complicated:

Examine the source for
http://mail-archives.apache.org/mod_mbox/incubator-general/201011.mbox/%3CC8F432CA.238DB%25Chris.A.Mattmann@jpl.nasa.gov%3E

At the end is:

<li>Unnamed multipart/alternative (inline, None, 0 bytes)</li>
<ul>
<li><a rel="nofollow" href="/mod_mbox/incubator-general/201011.mbox/raw/<C8F432CA.238DB%Chris.A.Mattmann@jpl.nasa.gov>/1">Unnamed text/plain</a> (inline, Quoted Printable, 5490 bytes)</li>
</ul>
</ul>
</td>
</tr>

   <tr class="raw">
    <td class="left"></td>
    <td class="right"><a href="/mod_mbox/incubator-general/201011.mbox/raw/%3cC8F432CA.238DB%25Chris.A.Mattmann@jpl.nasa.gov%3e" rel="nofollow">View raw message</a></td>
   </tr>
   </tbody>
  </table>
 </body>
</html>

Notice the first link is bad (and 400s), the second is good (and 200s).
So the subtleties in %-escaping % seem to have defeated the programmer here.
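
The difference between the two links can be reproduced with a short sketch. This is not mod_mbox's code (the module is written in C); the helper name `raw_message_href` and the use of Python's `urllib.parse.quote` are assumptions made purely for illustration.

```python
from urllib.parse import quote


def raw_message_href(list_path: str, msg_id: str) -> str:
    """Build a link to the raw message, percent-encoding the message id.

    Hypothetical helper for illustration; not part of mod_mbox.
    """
    # safe="@" keeps '@' literal, as in the working link;
    # '<', '>' and '%' are percent-encoded.
    return f"{list_path}/raw/{quote(msg_id, safe='@')}"


msg_id = "<C8F432CA.238DB%Chris.A.Mattmann@jpl.nasa.gov>"
list_path = "/mod_mbox/incubator-general/201011.mbox"

# Encoded like the second (working) link, apart from hex-digit case.
good = raw_message_href(list_path, msg_id)

# Message id pasted in unescaped, like the first (failing) link.
bad = f"{list_path}/raw/{msg_id}"
```

Python emits upper-case hex digits (`%3C`, `%3E`) where the page uses lower case (`%3c`, `%3e`); the two forms are equivalent in a URL.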
Comment 2 Stefan Fritsch 2011-10-21 22:45:29 UTC
This should be fixed in r1187588, but I will wait to close the bug until that is installed on mail-archives.apache.org.
Comment 3 Stefan Fritsch 2011-10-26 19:49:26 UTC
This is fixed now on mail-archives.apache.org
Comment 4 Stefan Fritsch 2011-10-26 19:52:03 UTC
*** Bug 52039 has been marked as a duplicate of this bug. ***
Comment 5 Sebb 2011-11-02 13:17:50 UTC
The Chris Mattmann mails no longer open in Firefox or Chrome in browse mode.

Has the fix been lost?
Comment 6 Sebb 2011-11-10 14:26:12 UTC
Still fails in Chrome 15.0.874.106 m

But has started working in Firefox 8.0.
This appears to be because Firefox 8.0 itself fixes the bad characters in the URL.

So it looks like the fix is still lost.
Comment 8 Stefan Fritsch 2012-04-14 10:24:23 UTC
Fixed with the merge of the convert-charset branch, r1326076
Now live on mail-archives.apache.org
Comment 9 Daniel Shahaf 2013-05-21 01:18:48 UTC
On mail-archives.apache.org, we now see a message that requires double-escaping:

http://mail-archives.apache.org/mod_mbox/oodt-dev/201304.mbox/%3CCD80E8CA.CEA8%2525Daniel.J.Crichton@jpl.nasa.gov%3E

At a guess, that is because the original message-id contains a percent sign followed by two hex digits.  (In the original example, the first two letters of "Chris" are not both hex digits.)
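
The guess above can be checked with a small sketch. The second message id is inferred by decoding the URL in this comment twice, and the function name `looks_like_escape` is hypothetical; neither comes from mod_mbox.

```python
import re

# A '%' followed by two hex digits is a syntactically valid percent escape.
VALID_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def looks_like_escape(msg_id: str) -> bool:
    """Return True if the raw message id contains something a decoder
    would treat as a percent escape."""
    return VALID_ESCAPE.search(msg_id) is not None


# "%Ch": 'h' is not a hex digit, so this is not a valid escape.
chris = looks_like_escape("<C8F432CA.238DB%Chris.A.Mattmann@jpl.nasa.gov>")

# "%Da": both 'D' and 'a' are hex digits, so a decoder will consume it.
daniel = looks_like_escape("<CD80E8CA.CEA8%Daniel.J.Crichton@jpl.nasa.gov>")
```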

See: https://issues.apache.org/jira/browse/INFRA-6244
Comment 10 Rainer Jung 2013-05-21 19:51:55 UTC
Remaining case fixed with r1484915.

The msgID was percent-decoded twice in the module, so a percent sign in the msgID caused trouble.

Since the msgID was derived from path_info, which was already percent-decoded, there was no reason to percent-decode the msgID itself again.

All links that include a msgID should already have been single-encoded before this commit, so they work well with single decoding.
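
The effect of the second decode can be simulated as follows. This is only a sketch using Python's `urllib.parse`, not the module's C code; the message id is inferred from the URL in Comment 9, and `decode_once` / `decode_twice` are hypothetical names standing in for the behaviour after and before the fix.

```python
from urllib.parse import quote, unquote


def decode_once(segment: str) -> str:
    """Behaviour after the fix: the path segment is decoded a single time."""
    return unquote(segment)


def decode_twice(segment: str) -> str:
    """Behaviour before the fix: an already-decoded value is decoded again."""
    return unquote(unquote(segment))


msg_id = "<CD80E8CA.CEA8%Daniel.J.Crichton@jpl.nasa.gov>"

# A correctly single-encoded link segment: '%' becomes '%25'.
single = quote(msg_id, safe="@")

# The double-escaped segment from Comment 9, needed while the bug was present.
workaround = "%3CCD80E8CA.CEA8%2525Daniel.J.Crichton@jpl.nasa.gov%3E"

ok_after_fix = decode_once(single) == msg_id         # single decode round-trips
broken_before = decode_twice(single) != msg_id       # "%Da" is consumed as an escape
workaround_ok = decode_twice(workaround) == msg_id   # why "%2525" used to be required
```

With double decoding, the literal "%Da" in the id is itself interpreted as an escape, so the looked-up id no longer matches the stored one.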

The code is running on the US instance of mail-archives.apache.org. I will update the EU instance during the next few hours as well.