This is almost certainly a variant of bug 15423, but that bug is fixed and this problem still exists. I'm not sure whether it's really a mod_deflate bug or a mod_include bug. If I have a server-parsed HTML file on an Apache 2 server that has mod_deflate enabled for HTML content, and it includes a CGI that issues a redirect to another HTML file, the redirected content isn't compressed, even though the server issues a "Content-Encoding: gzip" header line. My test case: dozer% cat test-apache2-redir.shtml <!--#include virtual="/cgi-bin/redirtest" --> Got to end! dozer% cat foo.html hiya dozer% cat ~/apache/cgi-bin/redirtest #!/bin/sh echo Location: /foo.html echo '' dozer% ( echo 'GET /test-apache2-redir.shtml HTTP/1.1' ; \ echo 'Accept-Encoding: gzip' ; \ echo 'Host: dozer' ; \ echo '' ) | nc dozer 8763 | less HTTP/1.1 200 OK Date: Tue, 04 Mar 2003 05:47:21 GMT Server: Apache/2.0.44 (Unix) mod_fastcgi/mod_fastcgi-SNAP-0210222112 Accept-Ranges: bytes Vary: Accept-Encoding Content-Encoding: gzip Connection: close Transfer-Encoding: chunked Content-Type: text/html; charset=ISO-8859-1 5 hiya 21 ^_<8B^@^@^@^@^@^@^CSp<CF>/Q(<C9>WH<CD>KQ<E4>^B^@<EB><D7>y<BD>^M^@^@^@ 0 The output (with the patch for bug 15423 applied) is correct if I don't accept gzip encoding: dozer% ( echo 'GET /test-apache2-redir.shtml HTTP/1.1' ; \ echo 'Host: dozer' ; \ echo '' ) | nc dozer 8763 HTTP/1.1 200 OK Date: Tue, 04 Mar 2003 05:49:36 GMT Server: Apache/2.0.44 (Unix) mod_fastcgi/mod_fastcgi-SNAP-0210222112 Accept-Ranges: bytes Vary: Accept-Encoding Connection: close Transfer-Encoding: chunked Content-Type: text/html; charset=ISO-8859-1 5 hiya d Got to end! 0
Reproduced here. Will dig into the code ...
does this also happen with the standard mod_cgi ? also in your config, how are turning deflate 'on' ? using Addoutputfilterbytype text/html DEFLATE The 'werid' thing is the defalte filter should not be able to be added to the subrequest. my thinking is that the first bucket has passed through, and then deflate has been added to the filter chain. (does this make sense?)
Yep. It's not a special problem of deflate or mod_cgi. The internal redirect bypasses the content filter (points directly to r->proto_output_filters). See my not-sure-if-ready-patch on the dev list :) My config is: <Directory /documentroot> : : Addoutputfilter includes shtml setoutputfilter deflate </Directory>
*** This bug has been marked as a duplicate of 14451 ***
That is not a dupe of 14451 (unfortunately).
Oops, right, this is still broken in 2.0.46. But your patch (which, for other people reading this, is at http://groups.yahoo.com/group/new-httpd/message/43096 ) works for me under 2.0.46. Any chance this fix will get rolled into the next release?
*** Bug 20945 has been marked as a duplicate of this bug. ***
*** Bug 21162 has been marked as a duplicate of this bug. ***
Hi, I tried the mentioned patch on 2.0.46 (+ new->output_filters = r->output_filters; + new->input_filters = r->input_filters; ) However it does _not_ fix the problem for me. Output created from mod_deflate is still broken.
I've been playing with this example, and have I think a better diagnosis. This is from mod_filter, into which I've now incorporated debugging info from the old mod_diagnostics (I'm planning to check mod_filter into CVS real soon now:-). The deflate filter gets called three times: (1) with an mmap bucket of size zero (before the SSI include) (2) with an mmap bucket from the include, followed by an EOS bucket (3) with an mmap bucket containing text after the SSI include, and *another* EOS. (2) is from the subrequest, and doesn't work because of /* only work on main request/no subrequests */ if (r->main) { ap_remove_output_filter(f); return ap_pass_brigade(f->next, bb); } We can't simply fix that by removing the test: then we'd have two interleaved but different streams of compressed data coming out. Running DEFLATE as a PROTOCOL filter but before prepending the headers (i.e. giving it a new slot in the filter chain) looks as if it should fix this, though I'm not sure. I'm thinking about whether I can fix this as part of mod_filter development.
On further thought, this is a much more general problem than mod_deflate. We should *always* merge data from subrequests *before* exposing them to any (more) output filters. I'm not sure how to fix this, but if mod_filter can do it so as to support the current mod_include and others without change, that would be a useful goal.
Another effect of this bug has popped up with PHP; any response data written using ap_r* may get buffered by the filter inserted into r->output_filters. But then if a subrequest->internal redirect occurs the output from the redirect bypasses the then r->main->output_filters and hence the buffered data is sent out-of-order.
Joe, yes, that's exactly the bug here ;-)
Now you ap_rflush() main request before running sub-request. I think it's bad idea *always* to flush. It make impossible to modify main request from sub-requst, impossible to set http status of main request (impossible make http redirect from sub request) impossible "conditional get" from sub-request. May be there sould be second boolean parameter "to flush or not to flush". Or may be simple call php's flush() before virtual().
(In reply to comment #15) > Now you ap_rflush() main request before running sub-request. Sorry, ehm, who exactly are you meaning with "you"? I have problems to put your comment in a proper context.
(In reply to comment #16) > Sorry, ehm, who exactly are you meaning with "you"? I have problems to put your > comment in a proper context. Ok. Look into php's virtual function: since php 5.0.3 added call ap_rflush() before running sub-request. And now in php 5.0.4 impossible to add or change http headers (or change http status) in main request from sub-request because headers already sended to user (sometimes it's good, sometimes bad), php-script developer should select this behaviour. ============ index.php <?php //... if ($should_we_handle_request) { handle_http_request(); } else { # call another handler virtual("/call-mod_perl"); } ?> call-mod_perl.pm sub handler : method { my $class = (@_ >= 2) ? shift : __PACKAGE__; # $r is Apache2::RequestRec my $r = shift; #... some work if ($need_redirect) { if ($r->is_initial_req()) { # this is main request, no php here $r->headers_out->set('Location' => $where_to_redirect); return Apache2::Const::REDIRECT; } else { # this is the sub-request (may be from php) my $main_r = $r->main(); # set Location: in main request not in sub-request $main_r->headers_out->set('Location' => $where_to_redirect); # return OK instead REDIRECT because it is sub-request's status return Apache2::Const::OK; } } } httpd.conf <Location "/call-mod_perl"> SetHandler modperl PerlResponseHandler call-mod_perl->handler </Location> =========== in this example redirect will not work if ap_rflush() main request before running sub-request (php 5.0.4), because user already got HTTP status 200.
Oh well, then "you" is bugs.php.net :-)
The flush is a necessary workaround for this bug, there is no point in asking PHP to be changed until this bug is fixed.
(In reply to comment #19) > The flush is a necessary workaround for this bug, there is no point in asking > PHP to be changed until this bug is fixed. what about integer virtual ( string filename, integer not_to_flush=0 )
The PHP developers are not going to make interface changes to work around 2.0 bugs.
(In reply to comment #21) > The PHP developers are not going to make interface changes to work around 2.0 bugs. The PHP developers not fix any bug. They make work around by changing behaviour of virtual() function. Will GLibc herd change behaviour of printf() because XYZ monitor has a bug? -)
Sorry for spam this bugzilla, php's bugzilla is strange. for Joe Orton What sub-request can to do? It can: - add content (most usable) - remove and may be change content - add, remove and change http headers - change http status - add, remove output filters if you flush main request before running sub-request then only 1-st is avaible (add content only)! Are you really want it? May be user can do: <?php flush(); virtual('add_content_only.html'); ?> if hi really need it, without any workarounds?
Vitaly, the workaround added to PHP is necessary to prevent garbled out-of-order content being sent (see php bug 30446) due to this 2.0 bug. The issues you are describing are far less serious than out-of-order content in a common use of virtual(). Therefore the workaround in PHP will stay until either someone comes up with a better workaround, or this 2.0 bug is fixed. If you have any input on how to achieve the former, please bring it up on the php internals@ list. Otherwise, this discussion is off-topic here.
hello,something bother me about virtual function. example: main.php <? echo "before virtual"; $res="KKKKKKKKKK"; request($res); $res.="EEEEEEEEEE"; echo $res; function request(&$res) { $res.="----------header---------\n"; virtual("myfunc.php"); $res.="-----------footer------\n"; echo "xxxxxxxxx".$res ."yyyyyyyyy"; } ?> myfunc.php: <? echo "in my func"; myfunc(); exit();//look here. #1 function myfunc() { echo "in sub func"; //exit(); //look here #2 } ?> different place of exit function got different response. #1: before virtualin my funcin sub funcxxxxxxxxxKKKKKKKKKK----------header--------- -----------footer------ yyyyyyyyyKKKKKKKKKK----------header--------- ---------- -footer------ EEEEEEEEEE #2: before virtualin my funcin sub funcxxxxxxxxx-----------footer------ yyyyyyyyyKKKKKKKKKK----------header--------- EEEEEEEEEE I think, #1 and #2 should gave the same response.Is there anything wrong?
*** Bug 40693 has been marked as a duplicate of this bug. ***
Hello! What is the status of this bug? At the moment i cannot use mod_deflate with mod_fcgid and mod_include... and this bug is open since 2003-03-04... Stefan
Hello! Is there a workaround? I've tried something like: SetEnvIf REQUEST_URI \.shtml$ no-gzip But that also does not work... Stefan
Hi, Is there any chance this issue will get fixed in a new mod_filter implementation? mod_deflate (or any FilterChain with more than one filter) is now unusable to anyone who does not know in advance if some of the processing will require subrequests. By the way, in httpd 2.2.3, when a file is run through mod_include containing a subrequest, and afterwards through mod_deflate, the output will not be mixed gzipped/ non-gzipped, but will be not compressed. The only problem now is, that mod_deflate still adds the Content-Encoding: gzip, which confuses the client... So this html-file: --- <html> <body><!--#include virtual="/output.cgi"--></body> </html> ----- and this output.cgi file: --- #!/bin/bash echo 'Content-Type: text/html' echo '' echo 'This is the output' --- Will be delivered, when ran through mod_include and mod_deflate as: ------ HTTP/1.1 200 OK Date: Tue, 10 Mar 2009 13:18:05 GMT Server: Apache Accept-Ranges: bytes Vary: Accept-Encoding Content-Encoding: gzip Connection: close Content-Type: text/html This is the output </body> </html> ----- As you can see: the body is not encoded. (and there isn't even a redirect in the cgi, it is run through a handler and just outputs the content). Jelmer
I believe I have identified the cause of this issue and have a resolution. mod_filter appears to be the only module that uses the filter_init_func field of the ap_filter_rec_t structure. This member is a pointer to function that is to be run right after the insert_filter hooks are run and right before the content handler is invoked. As most modules do their initialization at the time they are called to process data, this feature is largely unused. However, mod_filter chooses to do its initialization here so that it can correctly call the filter_init_func of each provider if it exists. mod_filter sets filter_init_func to be a pointer to its filter_init function. In filter_init, it first creates the context for the harness, before calling filter_init_func of each provider. The initial context for each provider and the ongoing context of the subsequently invoked provider are all stored in the harness context. The problem arises in that mod_filter intuitively assumes that filter_init will only ever be called once during the lifetime of the filter harness. However, when mod_include creates a subrequest, all of the filters from the main request get copied into the subrequest. Before the content handler is invoked in the subrequest, the filter_init_func of every filter in the subrequest gets called. This means that the filter_init_func of the filter harness gets called again, one addition time for each subrequest. The first thing that mod_filter does in filter_init is to assign the context pointer to freshly allocated memory. When called more than once, this essentially destroys the existing harness context and any provider contexts that it contains. In the case of mod_deflate, it loses any data waiting in the compression buffer. Furthermore, when mod_deflate is invoked again without a context, it assumes it is being called for the first time in that request. When it sees that Content-Encoding is already set to gzip, it removes itself from the filter chain, leaving the remaining output uncompressed. I don't know if it's considered correct or ideal operation that filter_init_func can be called more than once, but as it stands, mod_filter can't assume that it doesn't. I am attaching a small patch that will make filter_init first check for an existing harness context, simply returning if it finds one already exists. I tested this patch with 2.2.14.
Created attachment 25350 [details] Prevent filter_init in mod_filter from destroying it's own harness context This should resolve the issue. This patch was tested with 2.2.14.
(In reply to comment #31) > Created an attachment (id=25350) [details] > Prevent filter_init in mod_filter from destroying it's own harness context > > This should resolve the issue. This patch was tested with 2.2.14. I tested with the reproducer I found as described here using a virtual ssi and mod_deflate/mod_filter with your patch - see users archive for full details: http://mail-archives.apache.org/mod_mbox/httpd-users/200803.mbox/%3C48c2022e0803250736p4ce51f8diad6f5264ba123a94@mail.gmail.com%3E Built using 2.2.15 with your patch this still produces incorrect input * About to connect() to localhost port 8081 (#0) * Trying ::1... connected * Connected to localhost (::1) port 8081 (#0) > GET /test.shtml HTTP/1.1 > User-Agent: curl/7.16.4 (i386-apple-darwin9.0) libcurl/7.16.4 OpenSSL/0.9.7l zlib/1.2.3 > Host: localhost:8081 > Accept: */* > Accept-Encoding: gzip > < HTTP/1.1 200 OK < Date: Mon, 26 Apr 2010 05:57:07 GMT < Server: Apache/2.2.15 (Unix) < Accept-Ranges: bytes < Vary: Accept-Encoding < Content-Encoding: gzip < Transfer-Encoding: chunked < Content-Type: text/html < { [data not shown] % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 0 49 0 49 0 0 8305 0 --:--:-- --:--:-- --:--:-- 0* Connection #0 to host localhost left intact * Closing connection #0 paul-nasrats-macbook:httpd-2.2.15 pnasrat$ hexdump -C ~/test 00000000 3c 68 31 3e 49 6e 63 6c 75 64 65 64 3c 2f 68 31 |<h1>Included</h1| 00000010 3e 0a 1f 8b 08 00 00 00 00 00 00 03 0b 0e 71 0c |>.............q.| 00000020 0a e1 e2 72 f5 73 e1 02 00 bd 52 71 7a 0b 00 00 |.??r?s?..?Rqz...| 00000030 00 |.| 00000031
I may have posted this fix to the wrong original bug. However, this bug has a lot of posts referring to the issue with combining mod_include with mod_filter. It now appears that the original bug is unrelated to the mod_filter + mod_include issue. I will try to find the correct bug.
(In reply to comment #33) > I may have posted this fix to the wrong original bug. However, this bug has a > lot of posts referring to the issue with combining mod_include with mod_filter. > It now appears that the original bug is unrelated to the mod_filter + > mod_include issue. I will try to find the correct bug. Did you find a more correct bug for this issue? Experiencing it here on 2.2.15 with mod_filter, ssi and deflate. Preventing a full conversion to mod_filter unfortunately.
(In reply to comment #34) > Did you find a more correct bug for this issue? > > Experiencing it here on 2.2.15 with mod_filter, ssi and deflate. Preventing a > full conversion to mod_filter unfortunately. I have opened bug 49328 to address the mod_filter + mod_include + mod_deflate issue. If you are specifically experiencing corrupt output after a file is included with mod_include and mod_deflate is being used from within mod_filter, I would track that bug instead.
I'm using this patch to address this issue it is working fine. --- http/http_request.c 2006-07-12 05:38:44.000000000 +0200 +++ ../http_request.c 2006-10-07 19:29:33.000000000 +0200 @@ -374,17 +374,27 @@ static request_rec *internal_internal_re new->proto_output_filters = r->proto_output_filters; new->proto_input_filters = r->proto_input_filters; - new->output_filters = new->proto_output_filters; - new->input_filters = new->proto_input_filters; - if (new->main) { + new->output_filters = r->output_filters; + new->input_filters = r->input_filters; + /* Add back the subrequest filter, which we lost when * we set output_filters to include only the protocol * output filters from the original request. + * + * XXX: This shouldn't be neccessary any longer, because the filter + * is still in place -- isn't it? */ ap_add_output_filter_handle(ap_subreq_core_filter_handle, NULL, new, new->connection); } + else { + /* In subrequests we _must_ point to the complete upper request's + * filter chain, so skip the filters _only_ within the main request. + */ + new->output_filters = new->proto_output_filters; + new->input_filters = new->proto_input_filters; + } update_r_in_filters(new->input_filters, r, new); update_r_in_filters(new->output_filters, r, new); @@ -438,10 +448,19 @@ AP_DECLARE(void) ap_internal_fast_redire r->subprocess_env = apr_table_overlay(r->pool, rr->subprocess_env, r->subprocess_env); - r->output_filters = rr->output_filters; - r->input_filters = rr->input_filters; + /* copy the filters _only_ within the main request. In subrequests + * we _must_ point to the upper requests' filter chain, so do not + * touch 'em! + */ + if (!r->main) { + r->output_filters = rr->output_filters; + r->input_filters = rr->input_filters; + } if (r->main) { + /* XXX: This shouldn't be neccessary any longer, because the filter + * is still in place -- isn't it? + */ ap_add_output_filter_handle(ap_subreq_core_filter_handle, NULL, r, r->connection); }
The patch in comment 36 does indeed fix the reproducer I was seeing: Tested against 2.2.15 with both Alex's patch and Stefan's patch: hexdump -C ~/test 00000000 1f 8b 08 00 00 00 00 00 00 03 0b 0e 71 0c 0a e1 |............q..?| 00000010 b2 c9 30 b4 f3 cc 4b ce 29 4d 49 4d b1 d1 07 72 |??0???K?)MIM??.r| 00000020 b8 b8 5c fd 5c b8 00 4a 99 4a 1a 1d 00 00 00 |??\?\?.J.J.....| paul-nasrats-macbook:~ pnasrat$ gzcat /Users/pnasrat/test START <h1>Included</h1> END As there seems to be multiple issues conflated in this bug, if this is what is meant to be tracked in bug 49328 then Stefan can you attach the patch there?
I don't think Stefan's patch applies to bug 49328, as he is patching internal_internal_redirect and ap_internal_fast_redirect, which is called by mod_cgi(d) and/or mod_fastcgi. In my example, no CGIs are used to repeat the incorrect behavior. mod_include is calling make_sub_request directly, not making internal redirects. However, I think it's quite possible these bugs might be endemic of the same underlying design decision. It seems that there are a number of cases where one would want to operate only on the filters that were pushed during the current subrequest and not any filters that we inherited from a parent request. In bug 49328, only the filters that were added in the subrequest should have had their filter_init_func called, but the filters inherited from the parent are re-initialized along with the new ones. In bug 17629, only the filters that were added to the request that triggered the redirect should be removed, but since all of the filters except the protocol filters are removed, the data generated by the redirect gets passed straight to the protocol filters instead of first passing through its parent filters. That being said, I don't think that Stephan's patch correctly handles all cases of internal redirects, even through it does happen to work for the case listed in this bug. Instead of special-casing the behavior based on whether the current request is a main request or a subrequest, it should simply copy the parent request's filters (which would just consist of the protocol filters if we are the main request) into the subrequest, thus discarding any filters added in the current request. I think implementing the fix in such a way would at least entail creating new fields in the request_rec that represent the lists of input and output filters copied from the parent, establishing new invariants for how these lists should be treated, and then modification of request-handling logic in several files to take these new lists into account. Such a fix may also address bug 43939, which I suspect is being cause by an internal redirect in mod_dir.
Test case for this added: http://svn.apache.org/viewvc?rev=947641&view=rev
My "fix" fixes the problem with mod_deflate and ssi - nothing more.
Doing this: if (new->main) { + new->output_filters = r->output_filters; + new->input_filters = r->input_filters; seems attractive, and clearly fixes this particular bug, but I think it's a regression. The output of the internal redirect should be equivalent to a client following an ("external", if you will) HTTP redirect. Inheriting the entire filter chain for an internal redirect, any (resource/content-level) filters which applied to the original location would be applied to the redirect location. Imagine you have some content-transforming filter applied to resources in /a/ but not in /b/. Any internal redirect from /a/foo to /b/bar should hence act as if the client requested /b/bar directly -- without that filter applied.
Created attachment 25544 [details] proposed patch Proposed patch for this bug. This changes only the handling of internal redirect, not internal fast redirects (as seen with mod_dir, for example).
I can't convince myself that this issue does not exist with internal fast redirects too (mod_dir, mod_negotiation). But, per bug 43939, the changes on the trunk seem to be sufficient.
I've committed my proposed patch to the trunk: http://svn.apache.org/viewvc?view=revision&revision=952828 I've rediffed for 2.2.x and uploaded here: http://people.apache.org/~jorton/ap22_pr17629.patch further results from testing are very welcome!
Backported to 2.2.x as r988400.