Bug 41362 - mod_disk_cache and mod_include conflict generating incorrect pages from cache
Summary: mod_disk_cache and mod_include conflict generating incorrect pages from cache
Status: RESOLVED LATER
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_cache_disk / mod_disk_cache (show other bugs)
Version: 2.2.4
Hardware: All All
: P2 major (vote)
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords: MassUpdate
Depends on:
Blocks:
 
Reported: 2007-01-14 14:55 UTC by Giuliano Gavazzi
Modified: 2018-11-07 21:07 UTC (History)
1 user (show)



Attachments
diff showing the debugging patches to version 2.2.4 in mod_disk_cache.c (802 bytes, patch)
2007-01-15 16:57 UTC, Giuliano Gavazzi
Details | Diff
diff -u as required (1.15 KB, patch)
2007-01-16 00:38 UTC, Giuliano Gavazzi
Details | Diff
patch against mod_disk_cache.c r470455 that can be used in 2.2.4 (6.20 KB, patch)
2007-01-16 17:07 UTC, Giuliano Gavazzi
Details | Diff
GDB macros for debugging (7.07 KB, text/plain)
2007-01-18 12:45 UTC, Ruediger Pluem
Details
A debug session as per instructions (1.86 KB, text/plain)
2007-01-18 15:51 UTC, Giuliano Gavazzi
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Giuliano Gavazzi 2007-01-14 14:55:50 UTC
Tested in FreeBSD 6.1 and MacOSX 10.4.x.
Simple to reproduce (note added a test at the bottom using r470455):

create a SSI file in your DocumentRoot:

$ cat > DocumentRoot/test.shtml<<EOF
<html>
<head>
        <title>cache-bug-test</title>
</head>
<body>
<p>just before the exec call</p>
<!--#exec cmd="./foo.sh" -->
<p>just after the exec call</p>

</body>
</html>
EOF

and the commad to execute:

$ cat > DocumentRoot/foo.sh <<EOF    
#!/bin/sh
echo bar
EOF

to be chmod 755

in the VirtualHost configuration:

    CacheEnable disk /
    CacheRoot /var/www/cache
    CacheIgnoreNoLastMod On
    CacheDefaultExpire 3600
    CacheMaxExpire 3600
    CacheMinFileSize 64
    CacheMaxFileSize 64000

the first time the file is accessed it is server correctly to the client (indented for clarity):

	<html>
	<head>
		<title>cache-bug-test</title>
	</head>
	<body>
	<p>just before the exec call</p>
	bar
	
	<p>just after the exec call</p>
	
	</body>
	</html>


the second time it is served as:

	<html>
	<head>
		<title>cache-bug-test</title>
	</head>
	<body>
	<p>just before the exec call</p>
	<html>
	<head>
		<title>cache-bug-test</title>
	</head>
	<body>
	<p>just before the exec call</p>
	bar
	
	<p>just after the exec call</p>
	
	</body>
	</html>

that is: the correct output is preceded by the raw file up to and including the SSI instruction.

I inserted some ap_log_*error calls in mod_disk_cache.c, one in particular in store_body:


        rv = apr_bucket_read(e, &str, &length, APR_BLOCK_READ);
		//GG 3 lines
		ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
					 "cache_disk: read from server [%s]",
					 str);

perhaps not really a proper description. Anyway, I get:

[Sun Jan 14 22:27:21 2007] [debug] mod_disk_cache.c(999): cache_disk: read from server [<html>\r
\n<head>\r\n\t<title>cache-bug-test</title>\r\n</head>\r\n<body>\r\n<p>just before the exec 
call</p>\r\n<!--#exec cmd="./foo.sh" -->\r\n<p>just after the exec call</p>\r\n\r\n</body>\r
\n</html>\r\n]
[Sun Jan 14 22:27:21 2007] [debug] mod_disk_cache.c(999): cache_disk: read from server [bar\n]
[Sun Jan 14 22:27:21 2007] [debug] mod_disk_cache.c(999): cache_disk: read from server []
[Sun Jan 14 22:27:21 2007] [debug] mod_disk_cache.c(999): cache_disk: read from server [\r\n<p>just 
after the exec call</p>\r\n\r\n</body>\r\n</html>\r\n]
[Sun Jan 14 22:27:21 2007] [debug] mod_disk_cache.c(999): cache_disk: read from server [(null)]

so, first it reads the whole file unparsed by mod_include (that's bad enough), then it gets the result 
from the parsing down to the end of the file. This is odd, and what puzzles me more: it is not even 
what it going to be served from cache, as you saw above.


I tested then with r470455 of mod_cache (grafted in a 2.2.4 distribution) and found that the output is 
fine but the content is stored each time a client requests it (and the serving crawls..)

Might be related to related to: http://issues.apache.org/bugzilla/show_bug.cgi?id=41200
Comment 1 Ruediger Pluem 2007-01-15 03:09:09 UTC
Could please supply the full virtual host configuration? Especially how you
trigger mod_include (via handler, SetOutfilter, etc.)?
Comment 2 Ruediger Pluem 2007-01-15 03:14:13 UTC
Please attach also an error log in debug mode for two requests against 2.2.4
(first one which work and second one from cache which fails).
Comment 3 Giuliano Gavazzi 2007-01-15 04:33:22 UTC
(In reply to comment #2)
> Please attach also an error log in debug mode for two requests against 2.2.4
> (first one which work and second one from cache which fails).

Thanks for looking at this!

The logs (trimmed of day and year to shorten the lines, note the lines with GG are my extra log 
described in my original report:

Cache cleared. first request (correct data served):

[13:18:10] [debug] mod_cache.c(129): Adding CACHE_SAVE filter for /new/test.shtml
[13:18:10] [debug] mod_cache.c(136): Adding CACHE_REMOVE_URL filter for /new/test.shtml
[13:18:10] [debug] mod_cache.c(609): cache: Caching url: /new/test.shtml
[13:18:10] [debug] mod_cache.c(615): cache: Removing CACHE_REMOVE_URL filter.
[13:18:10] [debug] mod_cache.c(658): cache: Added date header
[13:18:10] [debug] mod_disk_cache.c(964): disk_cache: Stored headers for URL http://
www.elfo.org.localhost:80/new/test.shtml?
[13:18:10] [debug] mod_disk_cache.c(999): GG cache_disk: read from server [<html>\r\n<head>\r\n
\t<title>cache-bug-test</title>\r\n</head>\r\n<body>\r\n<p>just before the exec call</p>\r
\n<!--#exec cmd="./foo.sh" -->\r\n<p>just after the exec call</p>\r\n\r\n</body>\r\n</html>\r
\n]
[13:18:10] [debug] mod_disk_cache.c(999): GG cache_disk: read from server [bar\n]
[13:18:10] [debug] mod_disk_cache.c(999): GG cache_disk: read from server []
[13:18:10] [debug] mod_disk_cache.c(999): GG cache_disk: read from server [\r\n<p>just after the 
exec call</p>\r\n\r\n</body>\r\n</html>\r\n]
[13:18:10] [debug] mod_disk_cache.c(999): GG cache_disk: read from server [(null)]
[13:18:10] [debug] mod_disk_cache.c(1058): disk_cache: Body for URL http://www.elfo.org.localhost:
80/new/test.shtml? cached.
[13:18:10] [debug] mod_cache.c(129): Adding CACHE_SAVE filter for /favicon.ico
[13:18:10] [debug] mod_cache.c(136): Adding CACHE_REMOVE_URL filter for /favicon.ico
[13:18:10] [error] [client 127.0.0.1] File does not exist: /Users/g/Documents/Projects/HTML/elfo-site/
elfo.org/trunk/favicon.ico


second request (incorrect data served from cache, but note too that the cache file itself contains 
incorrect data):

[13:18:41] [debug] mod_disk_cache.c(477): disk_cache: Recalled cached URL info header http://
www.elfo.org.localhost:80/new/test.shtml?
[13:18:41] [debug] mod_disk_cache.c(750): disk_cache: Recalled headers for URL http://
www.elfo.org.localhost:80/new/test.shtml?
[13:18:41] [debug] mod_cache.c(263): cache: running CACHE_OUT filter
[13:18:41] [debug] mod_cache.c(277): cache: serving /new/test.shtml
[13:18:41] [debug] mod_cache.c(129): Adding CACHE_SAVE filter for /favicon.ico
[13:18:41] [debug] mod_cache.c(136): Adding CACHE_REMOVE_URL filter for /favicon.ico
[13:18:41] [error] [client 127.0.0.1] File does not exist: /Users/g/Documents/Projects/HTML/elfo-site/
elfo.org/trunk/favicon.ico

SSI enabled in main config by:

<IfModule mime_module>
[...]
    AddType text/html .shtml
    AddOutputFilter INCLUDES .shtml
</IfModule>



virtual host configuration:

NameVirtualHost *:80

<VirtualHost *:80>
    ServerAdmin webmaster@dummy-host.example.com
    DocumentRoot /Users/g/Documents/Projects/HTML/elfo-site/elfo.org/trunk
    ServerName www.elfo.org.localhost
    ErrorLog logs/www.elfo.org.localhost-error_log
    CustomLog logs/www.elfo.org.localhost-access_log common
    UseCanonicalName On

    CacheEnable disk /
    CacheRoot /var/www/cache
    CacheIgnoreNoLastMod On
    CacheDefaultExpire 3600
    CacheMaxExpire 3600
    CacheMinFileSize 64
    CacheMaxFileSize 64000
    CacheDisable /mail
    CacheDisable /Webmail
    CacheDisable /webmail

<Directory "/Users/g/Documents/Projects/HTML/elfo-site/elfo.org/trunk">
    Options Includes Indexes FollowSymLinks ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
    AddHandler cgi-script .cgi
</Directory>

</VirtualHost>

I have done further research on the r470455 which saves the correct body data, but it does not serve 
the cached copy as it gets stuck in open_header_timeout until the timeout of 10 seconds fires.
Comment 4 Ruediger Pluem 2007-01-15 08:08:36 UTC
(In reply to comment #3)


> [13:18:10] [debug] mod_disk_cache.c(964): disk_cache: Stored headers for URL
http://
> www.elfo.org.localhost:80/new/test.shtml?

The line number above indicates that this is NOT the mod_disk_cache delivered
with 2.2.4 (it should be 954 instead of 964) or have you added further debugging
code above this section?:

Please use either the mod_disk_cache.c delivered with 2.2.4 or attach the
debugging patch you have applied to mod_disk_cache.c. This makes debugging and
analyzing your logs easier. Thanks.
Comment 5 Giuliano Gavazzi 2007-01-15 16:57:27 UTC
Created attachment 19408 [details]
diff showing the debugging patches to version 2.2.4  in mod_disk_cache.c
Comment 6 Giuliano Gavazzi 2007-01-16 00:38:30 UTC
Created attachment 19409 [details]
diff -u as required

please delete the obsoleted diff
Comment 7 Giuliano Gavazzi 2007-01-16 17:07:39 UTC
Created attachment 19417 [details]
patch against mod_disk_cache.c r470455 that can be used in 2.2.4

Grafting r470455 of the cache module in httpd 2.2.4 after applying the
following patch seems to solve the problem. Please comment. Thank you
Comment 8 Ruediger Pluem 2007-01-18 12:45:02 UTC
This is weird. Like in PR41200 I cannot reproduce this on my system. I get the
whole SSI processed file saved in the cache and not its source. Could you please
do the following with an unpatched 2.2.4?

1. Copy the attached .gdbinfo file into the directory of the user you use for 
   attaching gdb to httpd (possibly root).

2. Start httpd -X

3. Attach gdb to this process.

4. Issue the following gdb commands
   break mod_cache.c:810
   break mod_cache.c:350
   cont

5. Request test.shtml via browser

6. Once gdb returns to the prompt issue the following commands
   dump_filters r->output_filters
   dump_brigade in
   cont
   dump_filters r->output_filters
   dump_brigade in
   cont
   dump_filters r->output_filters
   dump_brigade in
   quit

Thanks for your help.
Comment 9 Ruediger Pluem 2007-01-18 12:45:58 UTC
Created attachment 19427 [details]
GDB macros for debugging
Comment 10 Giuliano Gavazzi 2007-01-18 15:49:18 UTC
darn! Now I cannot fully reproduce the bug. Originally I found this on FreeBSD 6.1-STABLE with httpd 
2.2.3. Then I tested on MacOSX 10.4.x httpd 2.2.4 and found the same problem. Now, still on MacOSX 
and 2.2.4 I see only half of the problem. But I include the result of the debug session.
First let me say that what I still observe is this:

1) the file il cached but a subsequent request recaches it.
2) the content of the data file is incorrect (and is the same as it used to be the served data):

***** cache
<html>
[...]
<p>just before the exec call</p>
<html>
[...]
<p>just before the exec call</p>
bar

<p>just after the exec call</p>

</body>
</html>

3) the served data is correct:

<html>
[...]
<p>just before the exec call</p>
bar

<p>just after the exec call</p>

</body>
</html>


Debug results: attached.
Please note that it hit the breakpoints once each. The content was completely served after continuing 
past the second one.
Comment 11 Giuliano Gavazzi 2007-01-18 15:51:30 UTC
Created attachment 19429 [details]
A debug session as per instructions

And thank you again!
Comment 12 Giuliano Gavazzi 2007-01-19 03:35:09 UTC
stupid me. I could now reproduce the bug and found out why it did not appear when using curl. Curl send 
a Pragma: no-cache.
With Safari the second time I GET the file it returns, like in my original report, the incorrect version stored 
in the cache. The same with curl if I set CacheIgnoreCacheControl.
Obviously at the second GET the debug session does not hit the breakpoints, so I do not attach any new 
debug output.
Comment 13 Philp M. Gollucci 2009-01-18 16:19:16 UTC
CC myself on FreeBSD related bugs
Comment 14 William A. Rowe Jr. 2018-11-07 21:07:57 UTC
Please help us to refine our list of open and current defects; this is a mass update of old and inactive Bugzilla reports which reflect user error, already resolved defects, and still-existing defects in httpd.

As repeatedly announced, the Apache HTTP Server Project has discontinued all development and patch review of the 2.2.x series of releases. The final release 2.2.34 was published in July 2017, and no further evaluation of bug reports or security risks will be considered or published for 2.2.x releases. All reports older than 2.4.x have been updated to status RESOLVED/LATER; no further action is expected unless the report still applies to a current version of httpd.

If your report represented a question or confusion about how to use an httpd feature, an unexpected server behavior, problems building or installing httpd, or working with an external component (a third party module, browser etc.) we ask you to start by bringing your question to the User Support and Discussion mailing list, see [https://httpd.apache.org/lists.html#http-users] for details. Include a link to this Bugzilla report for completeness with your question.

If your report was clearly a defect in httpd or a feature request, we ask that you retest using a modern httpd release (2.4.33 or later) released in the past year. If it can be reproduced, please reopen this bug and change the Version field above to the httpd version you have reconfirmed with.

Your help in identifying defects or enhancements still applicable to the current httpd server software release is greatly appreciated.