Bug 41960 - Apache is not using custom content-types when accessing content-negotiated resources
Summary: Apache is not using custom content-types when accessing content-negotiated re...
Status: RESOLVED FIXED
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: Core (show other bugs)
Version: 2.2.4
Hardware: PC Linux
: P2 normal (vote)
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords: FixedInTrunk, PatchAvailable
Depends on:
Blocks:
 
Reported: 2007-03-27 10:31 UTC by Jose Kahan
Modified: 2008-11-15 15:51 UTC (History)
2 users (show)



Attachments
Restored location walking in subrequests. Optimized released code in 2.2.4 was broken. (1.60 KB, patch)
2007-06-05 08:30 UTC, Jose Kahan
Details | Diff
Restablishes location walking in subrequests (Against v/2.2.4) (1.01 KB, patch)
2007-08-06 01:36 UTC, Jose Kahan
Details | Diff
Restablishes location walking in subrequests (against httpd trunk) (717 bytes, patch)
2007-08-10 07:33 UTC, Jose Kahan
Details | Diff
Restablishes location walking in subrequests (Against v/2.2.8) (682 bytes, patch)
2008-01-31 09:05 UTC, Jose Kahan
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jose Kahan 2007-03-27 10:31:04 UTC
Apache version:

 This error is consistent in both Apache 2.0.54 (debian stable), 
 Apache 2.2.3 (debian testing), and 
 Apache 2.2.4 (official tar-ball compiled in-place),

 This wasn't an error in Apache 1.3.6 (and older 1.3.x versions).

Test scenario:

We're overriding the content-type (and charset) for a specific
location:

Server configuration

<LocationMatch "^/conneg-test">
 Options MultiViews
 AddType text/html;charset=utf-8      html htm
</LocationMatch>

Requesting a file in that directory using the complete URI returns the
custom content-type:

$ HEAD http://localhost/conneg-test/design.html
200 OK
Connection: close
Date: Tue, 27 Mar 2007 17:26:14 GMT
Accept-Ranges: bytes
ETag: "48131-1acb-c5210c00"
Server: Apache/2.2.4 (Unix)
Content-Length: 6859
Content-Type: text/html; charset=utf-8
Last-Modified: Mon, 26 Mar 2007 10:21:36 GMT
Client-Date: Tue, 27 Mar 2007 17:26:14 GMT
Client-Peer: 127.0.0.1:80
Client-Response-Num: 1

Requesting the same file in that directory using conneg returns the
server's default content-type for an ".html .htm" resource. The custom
one has not been taken into account when Apache is doing its internal
subrequest for solving the content-negotiated resource:

$ HEAD http://localhost/conneg-test/design 
200 OK
Connection: close
Date: Tue, 27 Mar 2007 17:27:35 GMT
Accept-Ranges: bytes
ETag: "48131-1acb-c5210c00;a4376980"
Server: Apache/2.2.4 (Unix)
Vary: negotiate
Content-Length: 6859
Content-Location: design.html
Content-Type: text/html
Last-Modified: Mon, 26 Mar 2007 10:21:36 GMT
Client-Date: Tue, 27 Mar 2007 17:27:35 GMT
Client-Peer: 127.0.0.1:80
Client-Response-Num: 1
TCN: choice

Proof:

The server's default content-type is being served for
content-negotiated resources.  If I add the following configuration
directive before the declarations of the virtual servers:

AddType text/my_html;charset=utf-16      html htm

That is the content-type that is returned by my conneg request:

$ HEAD http://localhost/conneg-test/design 
200 OK
Connection: close
Date: Tue, 27 Mar 2007 17:25:43 GMT
Accept-Ranges: bytes
ETag: "48131-1acb-c5210c00;a4376980"
Server: Apache/2.2.4 (Unix)
Vary: negotiate
Content-Length: 6859
Content-Location: design.html
Content-Type: text/my_html; charset=utf-16
Last-Modified: Mon, 26 Mar 2007 10:21:36 GMT
Client-Date: Tue, 27 Mar 2007 17:25:43 GMT
Client-Peer: 127.0.0.1:80
Client-Response-Num: 1
TCN: choice

Requesting http://localhost/conneg-test/design.html still returns the
custom content-type, not the default one.
Comment 1 Basant Kumar Kukreja 2007-03-27 19:19:58 UTC
I am able to reproduce the issue. It works fine in httpd 1.3.x but doesn't work
in httpd 2.2.x as reported.
Comment 2 Basant Kumar Kukreja 2007-03-28 11:20:01 UTC
When we send request /conneg-test/design, in mod_nogotiation walks through the
directory and after it finds a matching design.html then it starts a sub
request calling ap_sub_req_lookup_dirent. (In apache 1.3.x
ap_sub_req_lookup_file is invoked)

Sub request will eventually call ap_process_request_internal (called
process_request_internal in 1.3.x). Processing of ap_process_request_internal
is changed from 2.2.x from 1.3.x. In apache 1.3, location_walk was called
unconditionally while in apache 2.2.x ap_location_walk is called only file_req
is false.

request.c
-------------
    int file_req = (r->main && r->filename);
...
    if (!file_req) {
        if ((access_status = ap_location_walk(r))) {
...


So as a result ap_location_walk is not called for sub request in apache 2.2.x while
it is getting called for 1.3.x.
Because of this difference in logic, different behaviour is observed in
different apache versions ( 1.3.x and  2.2.x)

I guess that the above mentioned change in apache 2.2.x is made because of
performance reason.
Comment 3 Basant Kumar Kukreja 2007-03-28 12:52:18 UTC
As I mentioned before, the issue is not limited to Content type, but the
actual issue is not calling ap_location_walk from ap_process_request_internal.

Suppose, we do the setting like this :

<LocationMatch "^/conneg-test">
    Options MultiViews
    AddHandler htmlhandler .html
    Action htmlhandler /cgi-bin/test.php
</LocationMatch>

------------------------------------------------
If we send the /conneg-test/design.html then /cgi-bin/test.php executes. But
if we send /conneg-test/design request then design.html document is sent.
------------------------------------------------
[/disk/apache/httpd-2.2.x] $ curl --dump-header - -o -
http://lbasantk3:4004/conneg-test/design.html
HTTP/1.1 200 OK
Date: Wed, 28 Mar 2007 19:39:33 GMT
Server: Apache/2.2.5-dev (Unix) mod_ssl/2.2.5-dev OpenSSL/0.9.8a DAV/2
Content-Length: 21
Content-Type: text/plain

test.php cgi script!
------------------------------------------------
[/disk/apache/httpd-2.2.x] $ curl --dump-header - -o -
http://lbasantk3:4004/conneg-test/design HTTP/1.1 200 OK
Date: Wed, 28 Mar 2007 19:39:36 GMT
Server: Apache/2.2.5-dev (Unix) mod_ssl/2.2.5-dev OpenSSL/0.9.8a DAV/2
Content-Location: design.html
Vary: negotiate
TCN: choice
Last-Modified: Wed, 28 Mar 2007 01:12:02 GMT
ETag: "7a79d-2c-53687c80;5377bec0"
Accept-Ranges: bytes
Content-Length: 44
Content-Type: text/html

<HTML>
<BODY>
test message
</BODY>
</HTML>
------------------------------------------------

The behaviour does not seem to be intuitive to apache users.
Comment 4 Jose Kahan 2007-05-11 20:25:11 UTC
Following your suggestion, and after doing some tests, I applied the following 
patch to server/request.c:

[[
*** /tmp/request.c      Sat May 12 04:54:31 2007
--- ./request.c Wed Jul 12 05:38:44 2006
***************
*** 152,161 ****
          return access_status;
      }

!     /* Rerun the location walk, which overrides any map_to_storage config.
       */
!     if ((access_status = ap_location_walk(r))) {
!       return access_status;
      }

      /* Only on the main request! */
--- 152,165 ----
          return access_status;
      }

!     /* Excluding file-specific requests with no 'true' URI...
       */
!     if (!file_req) {
!         /* Rerun the location walk, which overrides any map_to_storage config.
!          */
!         if ((access_status = ap_location_walk(r))) {
!             return access_status;
!         }
      }

      /* Only on the main request! */
]]
 
This solves the problem I originally stated, when using the plain-vanilla 
apache. Could you confirm if this patch works for you too (without any side 
effect)? 

request.c:ap_process_request_internal() does two conditional evaluations
using filereq. By doing tests, I saw that the the patch worked when suppressing
it in the second conditional evaluation. I didn't notice any difference in 
behavior if I supressed it in the first evaluation too.  I'm not sure if you 
had proposed suppressing all filereq evaluations throughout all of  
request.c:ap_process_request_internal(). In all cases, one of 
the consequences of keeping the first evaluation, is that ap_translate_name() 
won't be called when evaluating subrequests... and maybe this is not good.

I was unsure on how to translate
the code differences between apache1.3.37 and apache2.2.4. Apache 1.3.37
returns calls ap_die() for some of the functions functions that are inside
these evaluations, where apache2.2.4 just does a return access_status. Is it 
equivalent in the new architecture?

Could you confirm if this patch works for you (without any side effect)? Or
provide further advice? I'm willing to continue testing and trying code 
changes.
 
When I add our custom aaa modules, I've a side-effect where some of the 
previously authorized requests that returned HTTP 200, are now giving back HTTP 
403. This may come from work arounds I had added to copy the request's context. 
I will look at this in some days and report again.

Thanks!

-jose


Comment 5 Basant Kumar Kukreja 2007-05-21 12:04:52 UTC
Can you provide the patch in diff -u format?
Comment 6 Jose Kahan 2007-06-04 16:32:09 UTC
Sorry for the delay. My laptop broke down.

I continued my tests. I removed all the file_req optimizations in
request.c:ap_process_request_internal() and this solved the bug I had reported,
without any noticeable side-effect on our server. The aaa behavior I had
seen in our modules was due to something else. 

Here's the diff -Naur patch. Could you confirm if this patch works for you
(without any side effect) or provide further advice? I'm willing to continue
testing and trying code 
changes.

Thanks!

-jose

--- request.c.orig	2007-05-31 15:35:57.000000000 +0000
+++ request.c	2007-06-04 23:18:48.000000000 +0000
@@ -101,7 +101,6 @@
  */
 AP_DECLARE(int) ap_process_request_internal(request_rec *r)
 {
-    int file_req = (r->main && r->filename);
     int access_status;
 
     /* Ignore embedded %2F's in path for proxy requests */
@@ -133,16 +132,15 @@
      * next several steps.  Only file subrequests are allowed an empty uri,
      * otherwise let translate_name kill the request.
      */
-    if (!file_req) {
-        if ((access_status = ap_location_walk(r))) {
-            return access_status;
-        }
+    if ((access_status = ap_location_walk(r))) {
+      return access_status;
+    }
 
-        if ((access_status = ap_run_translate_name(r))) {
-            return decl_die(access_status, "translate", r);
-        }
+    if ((access_status = ap_run_translate_name(r))) {
+      return decl_die(access_status, "translate", r);
     }
 
+
     /* Reset to the server default config prior to running map_to_storage
      */
     r->per_dir_config = r->server->lookup_defaults;
@@ -152,14 +150,10 @@
         return access_status;
     }
 
-    /* Excluding file-specific requests with no 'true' URI...
+    /* Rerun the location walk, which overrides any map_to_storage config.
      */
-    if (!file_req) {
-        /* Rerun the location walk, which overrides any map_to_storage config.
-         */
-        if ((access_status = ap_location_walk(r))) {
-            return access_status;
-        }
+    if ((access_status = ap_location_walk(r))) {
+      return access_status;
     }
 
     /* Only on the main request! */

Comment 7 Davi Arnaut 2007-06-05 05:49:40 UTC
Nitpicking: put patches in the Attachments list on a bug.
Comment 8 Jose Kahan 2007-06-05 08:30:28 UTC
Created attachment 20313 [details]
Restored location walking in subrequests. Optimized released code in 2.2.4 was broken.
Comment 9 Basant Kumar Kukreja 2007-06-11 13:01:59 UTC
I think the patch you submitted will work but ...

I fear from comments 
    /* All file subrequests are a huge pain... they cannot bubble through the
     * next several steps.  Only file subrequests are allowed an empty uri,
     * otherwise let translate_name kill the request.
     */
that there may be other consequences of the fix especially performance. I
could not arrive at the fix which behaves well as well as perform better.

Comment 10 Jose Kahan 2007-07-04 13:36:54 UTC
We have been using this patch for over a month in an internal Apache2 
production mirror of our W3C server. While we can't give yet any evaluation of 
its performance as it's not yet being used as an official mirror, we haven't 
detected any side-effect from the patch so far (over 50 staff guys use this 
mirror server in their every day activities).

This is just an empirical observation based on our use and our configuration of 
the Apache2 server. We're still waiting for the approval from the Apache gurus 
at ASF that it is a solid patch.

I'm less worried by the patch affecting performance of a correctly working 
server than by the non-use of the patch affecting the content-negotiation 
skills of a more performant server. The best solution would be to have a patch 
that doesn't affect performance and solves the bug :)
Comment 11 Jose Kahan 2007-08-06 01:36:07 UTC
Created attachment 20600 [details]
Restablishes location walking in subrequests (Against v/2.2.4)

There was a case where the previous patch didn't work. This is a revision (and
simplification) of the previous patch that includes one of the changes commited
to SVN and which solves both conneg issues. These issues are new with respect
to v2.2.4. They used to work on v.1.3.x

Test scenario for this additional conneg bug. Like in the previous scenario,
I assume we have a location called /conneg-test. In this location, there's a
file called test.html

1. Put in the global server configuration:
  AddType text/html;charset=iso-8859-1			  html htm

2. Add the following specific rule for /conneg-test

<LocationMatch "^/conneg-test/">
  Options +Multiviews
  AddType text/html;charset=utf-8	  html htm
</LocationMatch>

3. Get the file using its own location: specific content-type override works

HEAD http://www.example.com/conneg-test/test.html | grep -i Content

Content-Type: text/html; charset=utf-8

4. Get the file using content-negotiation: we get the global content-type.
The specific one wasn't taken into account by the subrequest

HEAD http://www.example.com/conneg-test/test.html | grep -i Content

Content-Location: test.html
Content-Type: text/html; charset=iso-8859-1

----------------

This issue is different from the previous one in that the specific Content-Type
is stated inside the server's configuration, rather than in a .htaccess file.
The revised patch fixes both issues.
Comment 12 Jose Kahan 2007-08-10 07:33:04 UTC
Created attachment 20638 [details]
Restablishes location walking in subrequests (against httpd trunk)

Same patch as before, but this time against Apache's svn trunk (following
Nick Kew's suggestion).
The trunk version has the same bug as 2.2.4. I tested the patch and the
fix is still valid.
Comment 13 William A. Rowe Jr. 2007-09-13 23:09:03 UTC
Thank you for your bug submission and analysis.  Commenting so that my primary
bugzilla filter (open, and comments by myself) keeps your report on my own radar.

Again, thanks for submitting these proposed changes and the detailed explanation
of your proposal.
Comment 14 Nick Kew 2007-09-26 07:28:29 UTC
Patch applied in trunk in r579664.  Better unoptimised than buggy.
Comment 15 Jose Kahan 2008-01-31 09:05:01 UTC
Created attachment 21456 [details]
Restablishes location walking in subrequests (Against v/2.2.8)

Removed a small part of the patch (contributed by someone else) that is already
proposed by 2.2.8.
Comment 16 Nick Kew 2008-11-15 15:46:23 UTC
*** Bug 45959 has been marked as a duplicate of this bug. ***
Comment 17 Nick Kew 2008-11-15 15:51:21 UTC
Was fixed in 2.2.9!