System: Sun Netra T1, Solaris 7, Apache 2.0.45 with the worker MPM. When stress testing a binary CGI, the mod_cgid daemon would die. Apache detects that the daemon has died and logs this message in the error_log: "No such process: cgid daemon is gone; is Apache terminating?". It does not attempt to restart the cgid daemon process, so all subsequent CGI requests fail. Apache 2 with the worker MPM would be more robust for CGI if mod_cgid could restart its child daemon process when it dies.
I have done some more research. When the cgid_daemon dies I see the following error message; the pid is for the cgid_daemon:

[Wed May 28 11:05:59 2003] [notice] child pid 24470 exit signal Broken pipe (13)

I added some debug logging to the cgid_maint function in mod_cgid.c. When the cgid_daemon dies with a broken pipe, cgid_maint logged the following:

[Wed May 28 11:05:59 2003] [error] cgid_maint
[Wed May 28 11:05:59 2003] [error] cgid_maint APR_OC_REASON_DEATH
[Wed May 28 11:05:59 2003] [error] cgid_maint
[Wed May 28 11:05:59 2003] [error] cgid_maint APR_OC_REASON_UNREGISTER
[Wed May 28 11:05:59 2003] [error] cgid_maint DONE
[Wed May 28 11:05:59 2003] [error] cgid_maint DONE

The code for doing a graceful restart is triggered when APR_OC_REASON_LOST is the reason, but this is never received by cgid_maint. Here are all the log entries when I reproduced this:

[Wed May 28 11:05:12 2003] [notice] Apache/2.0.45 (Unix) mod_ssl/2.0.45 OpenSSL/0.9.7a mod_jk/1.2.3 configured -- resuming normal operations
[Wed May 28 11:05:12 2003] [info] Server built: May 15 2003 21:22:19
[Wed May 28 11:05:57 2003] [error] [client 207.160.133.9] Premature end of script headers: counterdate.cgi
[Wed May 28 11:05:57 2003] [error] [client 207.160.133.9] unable to include "/cgi/counterdate.cgi" in parsed file /export/home/moxp/www/www/index.html
[Wed May 28 11:05:57 2003] [info] (32)Broken pipe: core_output_filter: writing data to the network
[Wed May 28 11:05:58 2003] [info] (32)Broken pipe: core_output_filter: writing data to the network
[Wed May 28 11:05:58 2003] [info] (32)Broken pipe: core_output_filter: writing data to the network
[Wed May 28 11:05:58 2003] [error] [client 207.160.133.9] Premature end of script headers: counter.cgi
[Wed May 28 11:05:58 2003] [error] [client 207.160.133.9] unable to include "/cgi/counter.cgi" in parsed file /export/home/moxp/www/www/index.html
[Wed May 28 11:05:58 2003] [error] [client 207.160.133.9] Premature end of script headers: counterdate.cgi
[Wed May 28 11:05:58 2003] [error] [client 207.160.133.9] unable to include "/cgi/counterdate.cgi" in parsed file /export/home/moxp/www/www/index.html
[Wed May 28 11:05:58 2003] [error] [client 207.160.133.9] Premature end of script headers: counterdate.cgi
[Wed May 28 11:05:58 2003] [error] [client 207.160.133.9] unable to include "/cgi/counterdate.cgi" in parsed file /export/home/moxp/www/www/index.html
[Wed May 28 11:05:58 2003] [error] [client 207.160.133.9] Premature end of script headers: counterdate.cgi
[Wed May 28 11:05:58 2003] [error] [client 207.160.133.9] unable to include "/cgi/counterdate.cgi" in parsed file /export/home/moxp/www/www/index.html
[Wed May 28 11:05:58 2003] [error] [client 207.160.133.9] Premature end of script headers: counterdate.cgi
[Wed May 28 11:05:58 2003] [error] [client 207.160.133.9] unable to include "/cgi/counterdate.cgi" in parsed file /export/home/moxp/www/www/index.html
[Wed May 28 11:05:58 2003] [error] [client 207.160.133.9] Premature end of script headers: counter.cgi
[Wed May 28 11:05:58 2003] [error] [client 207.160.133.9] unable to include "/cgi/counter.cgi" in parsed file /export/home/moxp/www/www/index.html
[Wed May 28 11:05:59 2003] [notice] child pid 24470 exit signal Broken pipe (13)
[Wed May 28 11:05:59 2003] [error] cgid_maint
[Wed May 28 11:05:59 2003] [error] cgid_maint APR_OC_REASON_DEATH
[Wed May 28 11:05:59 2003] [error] cgid_maint
[Wed May 28 11:05:59 2003] [error] cgid_maint APR_OC_REASON_UNREGISTER
[Wed May 28 11:05:59 2003] [error] cgid_maint DONE
[Wed May 28 11:05:59 2003] [error] cgid_maint DONE
[Wed May 28 11:05:59 2003] [error] [client 207.160.133.9] (3)No such process: cgid daemon is gone; is Apache terminating?: /export/home/moxp/www/cgi/counter.cgi
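To make the gap concrete, here is a minimal sketch of the decision I would expect a maintenance callback to make. It does not use APR: the enum below is a hypothetical stand-in for the APR_OC_REASON_* codes, and maint_decide and server_stopping are names invented for illustration, not anything in mod_cgid.c. The point is only that a death report should lead to a daemon restart just like a lost-child report, unless the server itself is shutting down.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the APR_OC_REASON_* codes seen in the log.
 * The numeric values are illustrative and do not match APR's. */
enum oc_reason {
    OC_REASON_DEATH,       /* the child exited or was killed by a signal */
    OC_REASON_UNWRITABLE,
    OC_REASON_RESTART,     /* the server is restarting */
    OC_REASON_UNREGISTER,  /* the child is being dropped from the watch list */
    OC_REASON_LOST         /* the child vanished without a death report */
};

enum maint_action { ACTION_NONE, ACTION_RESTART_DAEMON };

/* Decide what a maintenance callback should do for a given reason.
 * server_stopping is an assumed flag meaning "shutdown in progress". */
enum maint_action maint_decide(enum oc_reason reason, int server_stopping)
{
    switch (reason) {
    case OC_REASON_DEATH:
    case OC_REASON_LOST:
        /* Both mean the daemon is gone; bring it back unless we are stopping. */
        return server_stopping ? ACTION_NONE : ACTION_RESTART_DAEMON;
    case OC_REASON_UNWRITABLE:
    case OC_REASON_RESTART:
    case OC_REASON_UNREGISTER:
    default:
        /* Nothing to restart: the server is handling these cases itself. */
        return ACTION_NONE;
    }
}
```

With this shape, the APR_OC_REASON_DEATH call shown in the log would trigger a restart, and the APR_OC_REASON_UNREGISTER call that follows it would be a no-op.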
Attached is a patch which will restart the cgid daemon if it dies. At first I tried doing a server restart, as the cgid_maint code had originally been set up to do ( kill(getpid(), AP_SIG_GRACEFUL); ), but after an Apache restart from cgid_maint I saw a number of the following warning messages. cgid worked fine though.

[Wed May 28 10:49:42 2003] [warn] long lost child came home! (pid 21018)
[Wed May 28 10:49:42 2003] [warn] long lost child came home! (pid 21019)

So I wrote a patch that just restarts the cgid daemon rather than restarting Apache itself. This patch seems to work fine.
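For readers who want the general idea without opening the attachment, the following is a minimal sketch of restarting only the child daemon, using plain POSIX fork/waitpid. It is not the attached patch and does not use APR or httpd APIs; supervise, start_daemon, flaky_daemon and the max_restarts limit are hypothetical names chosen for illustration.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Body of the daemon; generation counts how many times it has been restarted. */
typedef void (*daemon_fn)(int generation);

/* Fork a child that runs the daemon body. Returns the child's pid, or -1. */
static pid_t start_daemon(daemon_fn fn, int generation)
{
    pid_t pid = fork();
    if (pid == 0) {
        fn(generation);
        _exit(0);            /* daemon body returned: clean exit */
    }
    return pid;
}

/* Run the daemon and restart it whenever it dies abnormally.
 * Returns the number of restarts performed once the daemon exits cleanly,
 * or -1 on error or when max_restarts is exhausted. */
int supervise(daemon_fn fn, int max_restarts)
{
    int restarts = 0;
    pid_t pid = start_daemon(fn, 0);

    while (pid > 0) {
        int status;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return restarts;                 /* clean shutdown, stop here */
        if (restarts >= max_restarts)
            return -1;                       /* give up instead of looping forever */
        restarts++;
        pid = start_daemon(fn, restarts);    /* restart only the daemon */
    }
    return -1;
}

/* Example daemon: the first generation dies from SIGPIPE, as in the report;
 * later generations exit cleanly. */
void flaky_daemon(int generation)
{
    if (generation == 0) {
        signal(SIGPIPE, SIG_DFL);
        raise(SIGPIPE);
        _exit(1);            /* only reached if the signal did not terminate us */
    }
}
```

Calling supervise(flaky_daemon, 3) restarts the daemon once after the simulated SIGPIPE death. The restart limit is a design choice of this sketch, so that a daemon which dies immediately on every start cannot put the parent into a fork loop.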
Created attachment 6553: patch to restart cgid daemon if it dies
A final note. We started upgrading some of our production Sun Solaris servers to Apache 2 seven weeks ago. Two of those servers have had the cgid daemon die at once during the seven week period. This resulted in CGIs failing until our customers notified us of the problem. We then had to restart or stop/start Apache. The patch I submitted will automatically restart the cgid daemon so that only a few CGI requests fail, rather than continuous failure until a restart or stop/start of Apache.
Thanks for your patch, and hopefully it will get reviewed/committed soon. I wanted to mention that if you're running into any problems with cgid, you need to apply this patch: http://cvs.apache.org/viewcvs.cgi/httpd-2.0/modules/generators/mod_cgid.c.diff?r1=1.150&r2=1.151 That fixes a simple bug with horrendous consequences, including the murder of the cgid daemon process (with the SIGPIPE in the library).
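As background on why a broken pipe can kill a process outright, here is a small POSIX sketch. It is not the content of the linked diff, and write_after_reader_closed is a hypothetical name. With the default disposition, writing to a pipe or socket whose reader has gone away delivers SIGPIPE (the signal 13 in the log above) and terminates the writer; with SIGPIPE ignored, the same write fails with EPIPE (the "(32)Broken pipe" errno in the log) and the process can handle the error.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Write to a pipe whose read end is already closed, with SIGPIPE ignored.
 * Returns the errno from the failed write (expected: EPIPE), 0 if the
 * write unexpectedly succeeded, or -1 if the pipe could not be created.
 * Note: this leaves SIGPIPE ignored for the rest of the process. */
int write_after_reader_closed(void)
{
    int fds[2];
    ssize_t n;
    int err;

    if (pipe(fds) != 0)
        return -1;

    /* Without this line the write below would kill the process. */
    signal(SIGPIPE, SIG_IGN);

    close(fds[0]);                   /* the reader goes away */
    n = write(fds[1], "x", 1);       /* fails instead of raising a fatal signal */
    err = (n < 0) ? errno : 0;

    close(fds[1]);
    return err;
}
```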
patch committed, thanks!!!!! I'll propose it for merging into the stable branch (2.0.47-dev).
*** Bug 22483 has been marked as a duplicate of this bug. ***
*** Bug 23533 has been marked as a duplicate of this bug. ***