Summary: | dangerous PCRE patterns in *Match directives | ||
---|---|---|---|
Product: | Apache httpd-2 | Reporter: | Christoph Anton Mitterer <calestyo> |
Component: | Documentation | Assignee: | HTTP Server Documentation List <docs> |
Status: | RESOLVED LATER | ||
Severity: | enhancement | CC: | takashi.asfbugzilla, verde |
Priority: | P2 | Keywords: | MassUpdate |
Version: | 2.2.22 | ||
Target Milestone: | --- | ||
Hardware: | All | ||
OS: | All |
Description
Christoph Anton Mitterer
2012-06-29 01:31:52 UTC
Just noted, for DirectoryMatch (but not for the others, e.g. LocationMatch or AliasMatch): it obviously makes no sense to handle the case of multiple trailing "/", e.g. /var////www/public///. These are likely collapsed anyway, either by Apache internally or by the OS. Or no?

I just noted that, for the same reasons as in comment #1, and again only for directories, not for locations/aliases etc., "(?:$|/)" should be replaceable by "/" (also matching subdirectories) and "(?:$|/$)" by "(/$)" (not matching subdirs). However, even the first one doesn't work with Apache 2.2. When I give e.g. the pattern "^/foo/", I can open the URIs /foo/ and /foo/x, but not /foo. Why?

Can't follow. How about providing an update in patch form? Reclassifying as 2.2 based on comments.

Hi Eric. Where exactly do you get stuck? Generally, I see two "problems":
a) The documentation should be improved to better educate users about what they're doing there. I could write patches for this.
b) I don't understand why e.g.

    <DirectoryMatch ^/some/path/a/> #note the trailing slash
        Order Allow,Deny
        Allow From All
        Options +indexes
    </DirectoryMatch>

lets me access it via the URL http://somehost.org/a/ but NOT via http://somehost.org/a
http://somehost.org/a/foo => works, as expected
http://somehost.org/aFOO => doesn't work, as expected

Regarding the rebasing:
(a) applies to 2.4, too (I was working on its documentation)
(b) I'll set up a 2.4 installation to check whether that happens there, too

I agree with Christoph about the documentation problems for these directives. They're not wrong, but some additional warning must be added to avoid "over-matching". The problem is that a regex is *always* a partial match, but the non-regex counterparts do a full match. This makes a huge difference.

<Files "image.png"> Matches image.png, but not myimage.png nor image.png.zip.
<FilesMatch "image\.png"> Matches image.png, myimage.png, image.png.zip, image.png/foo, ...
This partial match is not expected by the user, since the non-regex directive does not work that way. It's important to make this distinction very clear in the docs, in all the path-related directives, and also to encourage the use of anchors and slashes to avoid the undesired partial matches.

<FilesMatch "/image\.png$"> Matches image.png only, in any folder.
<FilesMatch "^/image\.png$"> Matches image.png only, in root folder.

Slashes and $ are tricky in folder-related directives, such as <DirectoryMatch>.

<Directory "foo"> Matches folder foo, in any folder
<DirectoryMatch "foo"> Matches folders foo, foobar, myfoo, … in any folder
<DirectoryMatch "/foo"> Matches folders foo, foobar, … in any folder
<DirectoryMatch "/foo/"> Matches folder foo, in any folder
<DirectoryMatch "^/foo/"> Matches root folder foo, *and all its subfolders*, because of the partial match.
<DirectoryMatch "^/foo/$"> Matches root folder foo (only works in v2.4, see Bug 49809)
<DirectoryMatch "/foo/$"> Matches folder foo, as the last path component, in any folder (only works in v2.4, see Bug 49809)

For the user, it's difficult to understand all these subtle differences without examples and a proper explanation.

MY SUGGESTION

Since the partial match is the main culprit for the confusion, my suggestion for the docs is to update all the examples to use fully anchored regexes, with ^ and $, and to encourage the user to *always* do it this way, to avoid unexpected results. Even if all she wants is a partial match:

<FilesMatch "^.+\.(gif|png|jpg)$">
<DirectoryMatch "^.*/secret/.*$">

Then all the mentioned problems are reduced to only one problem: get your full regex right. No knowledge of Apache's inner workings is necessary. And if you don't know regex, don't mess with it :)

Aurelio, that's basically what I mean :) It's correct that the current examples are not strictly wrong; they are just not what the end user probably wants. And we should really teach them how to do it properly and safely.
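The partial-versus-anchored distinction discussed above can be reproduced outside httpd. The following is only a minimal sketch in Python, under a few assumptions: it uses Python's `re` module rather than PCRE (for patterns this simple the two should agree), the helper `partial_match` and the sample lists are made up for illustration, and it does not model which string httpd actually tests each directive against.

```python
import re


def partial_match(pattern: str, subject: str) -> bool:
    """Unanchored test: true if the pattern matches anywhere in subject."""
    return re.search(pattern, subject) is not None


# File-name style subjects (hypothetical samples taken from the thread).
names = ["image.png", "myimage.png", "image.png.zip"]
unanchored = [n for n in names if partial_match(r"image\.png", n)]
anchored = [n for n in names if partial_match(r"^image\.png$", n)]

# Directory style subjects (hypothetical samples).
paths = ["/foo/", "/foo/bar/", "/foobar/", "/x/foo/"]
loose_foo = [p for p in paths if partial_match(r"/foo", p)]
under_root_foo = [p for p in paths if partial_match(r"^/foo/", p)]
only_root_foo = [p for p in paths if partial_match(r"^/foo/$", p)]

print(unanchored)      # all three names
print(anchored)        # only "image.png"
print(loose_foo)       # all four paths
print(under_root_foo)  # "/foo/" and "/foo/bar/"
print(only_root_foo)   # only "/foo/"
```

With both ^ and $ in place, the pattern has to account for the whole subject, which is the behaviour most people expect from the non-regex directives.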
My original report already contained many suggestions on what I'd change and where. It's just that right now, my time is highly limited. Cheers, Chris.

Hi Christoph, my intention was to explain the problem in a different way, hopefully easier to understand for those who are not fully aware of it. And maybe my simpler suggestion is easier to implement. I hope that it helps the doc maintainers to get this issue fixed.

FWIW, I disagree with using the ?: syntax in the regular expressions. This isn't a regex tutorial, and using advanced regex techniques clouds the issue for people who are trying to learn httpd syntax. I certainly wouldn't -1 a patch, but I think that it makes the document more about regex syntax than about *Match syntax, which feels out of scope.

Well, I think one could add a one-liner describing what this is about, but apart from that I'd use the better version. The reason is simply that people will take these example patterns from the official documentation and extend them to their own needs. So why teach millions of users something worse, when they could easily use something better?

Yeah, it's a good point. I'm torn between "simplest possible example that works" and "best practice", when best practice requires significant additional explanation of regular expressions. I'll see if I can find a way to phrase it so that it doesn't simply confuse. Thanks for your remarks.

I've patched a bunch of different places, but I haven't yet done the (?:$|/) stuff. So, I'm going to leave this ticket open for now, because I lack the time to complete it thoroughly.

To summarize: There are some regexes in *Match directive examples where the trailing slash is left off, so that the pattern could match unintended extra characters. In these cases, we (might) want to put either (?:$|/) - a non-capturing "/ or end of string" pattern - or simply a [/$] on the end of it, to ensure that it matches *only* what we intend.
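One caveat on the two endings mentioned in the summary above: inside a character class, `$` is a literal dollar sign rather than an end-of-string anchor, so `[/$]` and `(?:$|/)` are not interchangeable. The sketch below illustrates this with Python's `re` module, used here as a stand-in for PCRE (an assumption; both treat `$` inside a class literally). The base pattern, the candidate paths, and the helper `matches` are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical base pattern with the trailing slash left off.
base = r"^/var/www/public"

# Hypothetical subjects to test against.
candidates = [
    "/var/www/public",
    "/var/www/public/",
    "/var/www/public/sub",
    "/var/www/publicity",
    "/var/www/public$",
]


def matches(pattern: str, subject: str) -> bool:
    """Unanchored test: true if the pattern matches anywhere in subject."""
    return re.search(pattern, subject) is not None


bare = [c for c in candidates if matches(base, c)]
bounded = [c for c in candidates if matches(base + r"(?:$|/)", c)]
char_class = [c for c in candidates if matches(base + r"[/$]", c)]

print(bare)        # every candidate, including "/var/www/publicity"
print(bounded)     # the path itself, with trailing slash, and the subdirectory
print(char_class)  # misses "/var/www/public", accepts "/var/www/public$"
```

In this sketch only the `(?:$|/)` ending accepts the directory both with and without a trailing slash while rejecting `/var/www/publicity`.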
(In reply to Rich Bowen from comment #12)
> I'll see if I can find a way to phrase it so that it doesn't simply confuse.
> Thanks for your remarks.

Maybe one should provide at each directive only the most minimal version of a regexp example and then link to a more detailed part of the documentation. This documentation could perhaps still work with the "simple version", but have a section called "In production use" or so, where it's explained that every usage of (...) is typically better with the complex form.

Yes, I think I'd like to do that.

Please help us to refine our list of open and current defects; this is a mass update of old and inactive Bugzilla reports which reflect user error, already resolved defects, and still-existing defects in httpd.

As repeatedly announced, the Apache HTTP Server Project has discontinued all development and patch review of the 2.2.x series of releases. The final release 2.2.34 was published in July 2017, and no further evaluation of bug reports or security risks will be considered or published for 2.2.x releases. All reports older than 2.4.x have been updated to status RESOLVED/LATER; no further action is expected unless the report still applies to a current version of httpd.

If your report represented a question or confusion about how to use an httpd feature, an unexpected server behavior, problems building or installing httpd, or working with an external component (a third party module, browser etc.) we ask you to start by bringing your question to the User Support and Discussion mailing list, see [https://httpd.apache.org/lists.html#http-users] for details. Include a link to this Bugzilla report for completeness with your question.

If your report was clearly a defect in httpd or a feature request, we ask that you retest using a modern httpd release (2.4.33 or later) released in the past year. If it can be reproduced, please reopen this bug and change the Version field above to the httpd version you have reconfirmed with.

Your help in identifying defects or enhancements still applicable to the current httpd server software release is greatly appreciated.