Bug 36751 - Apache + OpenSSL crashes while render mod_perl pages
Summary: Apache + OpenSSL crashes while render mod_perl pages
Status: RESOLVED INVALID
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_ssl (show other bugs)
Version: 2.0.54
Hardware: PC other
: P2 normal (vote)
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords:
: 36557 (view as bug list)
Depends on:
Blocks:
 
Reported: 2005-09-21 02:43 UTC by Shien Xie
Modified: 2005-09-28 04:39 UTC (History)
2 users (show)



Attachments
Local variables are in the file. (65.78 KB, text/plain)
2005-09-23 22:46 UTC, Shien Xie
Details
backtrace info while mod_perl crashes Apache (18.70 KB, text/plain)
2005-09-27 21:47 UTC, Shien Xie
Details
ms (vc2003) dump of perl calling exit(9) (274.65 KB, application/octet-stream)
2005-09-28 12:39 UTC, Issac Goldstand
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Shien Xie 2005-09-21 02:43:26 UTC
We have a Apache server 2.0.54 running on Windows XP Pro.


In httpd.conf, we created a few mod_perl filters to render web pages.
we use all of the following:
PerlResponseHandler 
PerlOutputFilterHandler 
PerlFixupHandler 

The apache server does not have problem if we run the apache binary(msi/no-ssl) 
without SSL.

Then, we decided to enable SSL.
I downloaded the source code of Apache/2.0.54 and openssl/0.9.7g and compiled 
them with Microsoft Visual Studio .NET 2003.

Apache can startup without any problem, however, while we try to load some web 
pages, especially when load pages very frequently. The child process will crash 
and the browser stuck for a long time while the child respowns.
In error.log, i saw this:
Apache2::Filter: (620019) APR does not understand this error code at -e line 0
[Tue Sep 20 18:35:18 2005] [notice] Parent: child process
ed with status 9 -- Restarting.

The problem does not happen all the time. It happens randomly after you try to 
load a few pages.
It seems that mod_perl call APR to translate some error code but unsucessful.
Then apache killed itself afterwards(probably because of mod_perl).


The environment is:
Apache/2.0.54 (Win32) mod_ssl/2.0.54 OpenSSL/0.9.7g DAV/2 mod_perl/2.0.0 
Perl/v5.8.6

Apache/2.0.54 and openssl/0.9.7g are compiled by Microsoft Visual Studio .NET 
2003.

I also tried mod_perl 2.0.1 and openssl 0.9.8 but not helpful.
Comment 1 Joe Orton 2005-09-21 10:19:13 UTC
If you think that mod_perl has killed the server then this is a mod_perl bug; see:
http://perl.apache.org/docs/2.0/user/help/help.html#Reporting_Problems

If you think there is some mod_ssl issue in here then you need to demonstrate it
e.g. by getting a backtrace from a crash.
Comment 2 Shien Xie 2005-09-23 22:46:02 UTC
Created attachment 16505 [details]
Local variables are in the file.
Comment 3 Shien Xie 2005-09-23 22:52:35 UTC
Here is the backtrace info. (From MS Visual Studio .NET 2003)

everytime when you load a page from browser,

function 
static int ssl3_write_pending(SSL *s, int type, const unsigned char *buf,
			      unsigned int len)
is called.

This is the call stack.
	ssleay32.dll!ssl3_write_pending(ssl_st * s=0x06bd0ee8, int type=23, 
const unsigned char * buf=0x06bbb26c, unsigned int len=4775)  Line 756	C
 	ssleay32.dll!do_ssl3_write(ssl_st * s=0x06bd0ee8, int type=23, const 
unsigned char * buf=0x06bbb26c, unsigned int len=4775, int 
create_empty_fragment=0)  Line 713 + 0x15	C
 	ssleay32.dll!ssl3_write_bytes(ssl_st * s=0x06bd0ee8, int type=23, const 
void * buf_=0x06bbb26c, int len=4775)  Line 542 + 0x1a	C
 	ssleay32.dll!ssl3_write(ssl_st * s=0x06bd0ee8, const void * 
buf=0x06bbb26c, int len=4775)  Line 1721 + 0x13	C
 	ssleay32.dll!SSL_write(ssl_st * s=0x06bd0ee8, const void * 
buf=0x06bbb26c, int num=4775)  Line 878 + 0x15	C
 	mod_ssl.so!ssl_filter_write(ap_filter_t * f=0x06b80538, const char * 
data=0x06bbb26c, unsigned int len=4775)  Line 773 + 0x13	C
 	mod_ssl.so!ssl_io_filter_output(ap_filter_t * f=0x06b80538, 
apr_bucket_brigade * bb=0x06bad4c0)  Line 1364 + 0x11	C
 	libhttpd.dll!ap_pass_brigade(ap_filter_t * next=0x06b80538, 
apr_bucket_brigade * bb=0x06bad4c0)  Line 512 + 0x10	C
 	libhttpd.dll!chunk_filter(ap_filter_t * f=0x06bad630, 
apr_bucket_brigade * b=0x06bad4c0)  Line 218 + 0x10	C
 	libhttpd.dll!ap_pass_brigade(ap_filter_t * next=0x06bad630, 
apr_bucket_brigade * bb=0x06bad4c0)  Line 512 + 0x10	C
 	libhttpd.dll!ap_http_header_filter(ap_filter_t * f=0x06b8d558, 
apr_bucket_brigade * b=0x06bad4c0)  Line 1692	C
 	libhttpd.dll!ap_pass_brigade(ap_filter_t * next=0x06b8d558, 
apr_bucket_brigade * bb=0x06bad4c0)  Line 512 + 0x10	C
 	libhttpd.dll!ap_content_length_filter(ap_filter_t * f=0x06b8d540, 
apr_bucket_brigade * b=0x06bad4c0)  Line 1233	C
 	libhttpd.dll!ap_pass_brigade(ap_filter_t * next=0x06b8d540, 
apr_bucket_brigade * bb=0x06bad4c0)  Line 512 + 0x10	C
 	libhttpd.dll!ap_byterange_filter(ap_filter_t * f=0x06b8d528, 
apr_bucket_brigade * bb=0x06bad4c0)  Line 2876 + 0x10	C
 	libhttpd.dll!ap_pass_brigade(ap_filter_t * next=0x06b8d528, 
apr_bucket_brigade * bb=0x06bad4c0)  Line 512 + 0x10	C
 	mod_perl.so!modperl_wbucket_pass()  + 0xf4	
 	mod_perl.so!modperl_wbucket_flush()  + 0x1b	
 	mod_perl.so!modperl_output_filter_handler()  + 0x28	
 	mod_perl.so!modperl_wbucket_pass()  + 0xf4	
 	mod_perl.so!modperl_wbucket_flush()  + 0x1b	
 	mod_perl.so!modperl_io_perlio_restore_stdout()  + 0xd0	


When error accurs, the return value of BIO_write is -1, which is different from 
s->s3->wbuf.left
			i=BIO_write(s->wbio,
				(char *)&(s->s3->wbuf.buf[s->s3->wbuf.offset]),
				(unsigned int)s->s3->wbuf.left);

Then the function returns with -1.

The attached file is the locals when reach this point.





then, in ssl_filter_write(),

then SSL_get_error is called:

int ssl_err = SSL_get_error(filter_ctx->pssl, res);
This is the call stack,
	ssleay32.dll!SSL_get_error(const ssl_st * s=0x06bfdd08, int i=-1)  Line 
1818	C
 	mod_ssl.so!ssl_filter_write(ap_filter_t * f=0x06b82620, const char * 
data=0x06dd0000, unsigned int len=346)  Line 776 + 0xf	C
 	mod_ssl.so!ssl_io_filter_output(ap_filter_t * f=0x06b82620, 
apr_bucket_brigade * bb=0x06b918a0)  Line 1364 + 0x11	C
 	libhttpd.dll!ap_pass_brigade(ap_filter_t * next=0x06b82620, 
apr_bucket_brigade * bb=0x06b918a0)  Line 512 + 0x10	C
 	libhttpd.dll!ap_http_header_filter(ap_filter_t * f=0x06bb0b20, 
apr_bucket_brigade * b=0x06b918a0)  Line 1692	C
 	libhttpd.dll!ap_pass_brigade(ap_filter_t * next=0x06bb0b20, 
apr_bucket_brigade * bb=0x06b918a0)  Line 512 + 0x10	C
 	libhttpd.dll!ap_content_length_filter(ap_filter_t * f=0x06bb0b08, 
apr_bucket_brigade * b=0x06b918a0)  Line 1233	C
 	libhttpd.dll!ap_pass_brigade(ap_filter_t * next=0x06bb0b08, 
apr_bucket_brigade * bb=0x06b918a0)  Line 512 + 0x10	C
 	libhttpd.dll!ap_byterange_filter(ap_filter_t * f=0x06bb0af0, 
apr_bucket_brigade * bb=0x06b918a0)  Line 2876 + 0x10	C
 	libhttpd.dll!ap_pass_brigade(ap_filter_t * next=0x06bb0af0, 
apr_bucket_brigade * bb=0x06b918a0)  Line 512 + 0x10	C
 	libhttpd.dll!default_handler(request_rec * r=0x06bafcd8)  Line 3610 + 
0x13	C
 	libhttpd.dll!ap_run_handler(request_rec * r=0x06bafcd8)  Line 153 + 0x4e
	C
 	libhttpd.dll!ap_invoke_handler(request_rec * r=0x06bafcd8)  Line 364 + 
0x9	C
 	libhttpd.dll!ap_process_request(request_rec * r=0x06bafcd8)  Line 249 + 
0x9	C
 	libhttpd.dll!ap_process_http_connection(conn_rec * c=0x06b82270)  Line 
251 + 0x9	C
 	libhttpd.dll!ap_run_process_connection(conn_rec * c=0x06b82270)  Line 
43 + 0x4e	C
 	libhttpd.dll!ap_process_connection(conn_rec * c=0x06b82270, void * 
csd=0x06b821a0)  Line 178	C
 	libhttpd.dll!worker_main(void * thread_num_val=0x000000f8)  Line 733
	C
 	msvcr71d.dll!_threadstartex(void * ptd=0x06b82098)  Line 241 + 0xd
	C
 	kernel32.dll!GetModuleFileNameA()  + 0x1b4	

In function SSL_get_error, the flow skips all if statements and hit the last 
return statement,
	return(SSL_ERROR_SYSCALL);
	
Then, in function ssl_filter_write, ap_log_error() will be called with a 
string "SSL output filter write failed.",
which is the error message you see in apache error log(debug mode).

        else if (ssl_err == SSL_ERROR_SYSCALL) {
            ap_log_error(APLOG_MARK, APLOG_INFO, outctx->rc, c->base_server,
                        "SSL output filter write failed.");
                        
The attached file is the locals when reach this point.

By the way, we configured and use the mod_perl filter functions. Once SSL 
generate that error message, mod_perl calls 
       MP_RUN_CROAK(modperl_output_filter_flush(filter),
                     "Apache2::Filter");
in modperl_run_filter(modperl_filter_t *filter) from modperl_filter.c, 
which kills apache child process.
Comment 4 Joe Orton 2005-09-24 16:37:05 UTC
I'm confused, which is the stack backtrace *when the process crashes*?  

You have given two stack traces but then described how the code continues from
there, so neither of those stack traces are actually triggered by the crash?

> By the way, we configured and use the mod_perl filter functions. Once SSL 
> generate that error message, mod_perl calls 
>       MP_RUN_CROAK(modperl_output_filter_flush(filter),
>                     "Apache2::Filter");
> in modperl_run_filter(modperl_filter_t *filter) from modperl_filter.c, 
> which kills apache child process.

so what is the stack trace *when the child process is killed*?
Comment 5 Shien Xie 2005-09-25 18:25:21 UTC
The first stacktrace is where SSL write operation fails.

The second stacktrace is where error code "SSL_ERROR_SYSCALL" is returned and 
where "SSL output filter write failed" is wrote into apache's error log.

The crash is triggered by mod_perl. Mod_perl calls MP_RUN_CROAK to kill apache 
child process when SSL returns that error code to it, which it can not 
understand. I will put more debug info once I get it.



Comment 6 Shien Xie 2005-09-25 18:34:37 UTC
So, neither of these two stacktrace is *when the process crashes*?  
I did not attach the stacktrace of that because I thought the root cause is 
why SSL returns that error code, not why mod_perl kills apache.

Anyway, I will paste the stacktrace *when the process crashes* on Monday.
Comment 7 Joe Orton 2005-09-25 18:57:58 UTC
Oh, I see what you're saying...

if a mod_perl "croak" means the whole process is terminated, then that much
really is a mod_perl bug.  Output filters may very well fail and return errors
under abnormal circumstances such as aborted connections.  

There is probably some separate mod_ssl error code handling issue here too w.r.t
the  error number conversion failure. 
Comment 8 Matthew Darwin 2005-09-25 19:01:58 UTC
from Carp(3perl) man page:

croak   - die of errors (from perspective of caller)
Comment 9 Shien Xie 2005-09-27 21:47:19 UTC
Created attachment 16540 [details]
backtrace info while mod_perl crashes Apache
Comment 10 Shien Xie 2005-09-27 21:58:29 UTC
If load SSL and mod_perl(use filter function) in same apache server, 
openssl returns "SSL output filter write failed" very frequently, (once every 5-
6 pages have been loaded by browser), it that a normal behavior of SSL?



Comment 11 Shien Xie 2005-09-27 22:01:01 UTC
It is likely that bug 36557 is about the same problem.
Comment 12 Joe Orton 2005-09-28 09:33:33 UTC
The fact that mod_perl aborts the process if an output filter fails is a
mod_perl bug; the mod_ssl behaviour is fine here, please report it following:

http://perl.apache.org/docs/2.0/user/help/help.html#Reporting_Problems
Comment 13 Joe Orton 2005-09-28 09:34:17 UTC
*** Bug 36557 has been marked as a duplicate of this bug. ***
Comment 14 Issac Goldstand 2005-09-28 12:37:25 UTC
I'm not sure if mod_perl is intentionally shutting down Apache's process or not...  
I think the problem is that croak ultimately calls exit (see dump file I
posted).    Usually - even in an embedded environment, this is what one expects
croak (or die) to do.  I think it may be problematic in an environment like
mod_perl2 where there are multiple interpreters living in the same process.  In
such a case, we don't want the entire process to die - I'd guess just the one
embedded interpreter (which would have to be replaced inside the child process).
 I'd also guess that this was noticed by Simon and myself on Win32 platforms
since unlike mod_worker which, although is threaded, spawns multiple child
processes, each containing worker threads, mpm_nt has only one child process, so
when it goes, we're stuck waiting until the child fully recycles... 

In any case, I'll follow-up on this on the mod_perl list.
Comment 15 Issac Goldstand 2005-09-28 12:39:04 UTC
Created attachment 16542 [details]
ms (vc2003) dump of perl calling exit(9)