1. Create a httpd.conf with 1500 websites. Every VirtualHost section should be like this: <VirtualHost xxxx> ServerName testXXX ServerAlias testXXX.com ServerAdmin webmaster@testXXX.com DocumentRoot /home/testXXX/public_html ScriptAlias /cgi-local/ /home/testXXX/public_html/cgi-local/ SetEnv LOG_SUBDIR testXXX SuexecUserGroup testXXX testXXX JkMount /*.jsp ajp13 JkMount /servlet/* ajp13 </VirtualHost> <VirtualHost xxxx:443> ServerName testXXX ServerAlias testXXX.com ServerAdmin webmaster@testXXX.com DocumentRoot /home/testXXX/public_html ScriptAlias /cgi-local/ /home/testXXX/public_html/cgi-local/ SetEnv LOG_SUBDIR testXXX SuexecUserGroup testXXX testXXX JkMount /*.jsp ajp13 JkMount /servlet/* ajp13 </VirtualHost> 2. create a simple script: #!/usr/bin/perl print "Content-Type: text/html\n\n"; print "Test!!"; 3. execute on the web Actual results: In a Dual Xeon, zero load, it takes 2 seconds to run. I did some straces: strace -T -f -p xxxxx [pid 30864] read(4, "", 4096) = 0 <0.000022> [pid 30864] close(4) = 0 <0.000016> [pid 30864] read(3, "", 4096) = 0 <0.000014> [pid 30864] close(3) = 0 <0.000012> [pid 30864] write(1, "Content-Type: text/html\n\nTest"..., 72) = 72 <0.000020> [pid 30785] <... poll resumed> [{fd=26, events=POLLIN, revents=POLLIN}, {fd=28, events=POLLIN}], 2, 30000) = 1 <2.120682> In "LogLevel debug", I see: [Fri Nov 11 12:55:50 2005] [25756:39072] [debug] jk_cleanup_shmem::mod_jk.c (1761): Shmem cleanup [Fri Nov 11 12:55:53 2005] [25756:39072] [debug] jk_cleanup_shmem::mod_jk.c (1761): Shmem cleanup (> 2 seconds !!) If I remove all the JkMount mapping, the CGI execution is very fast. So there is definitely an issue with mod_jk. with mod_jk -> more than 2 seconds without mod_jk -> a few milliseconds OBS: Those tests were done in a server without load. I simply turn apache 2.0.52 on and requested the CGI execution. I can test any patch.
What version of mod_jk and what Apache version (httpd -V) as well as OS. The messages you see are caused by recycling the child process so that's why they took 2 seconds. The question is why the child dies, so please provide more info about apache version, mpm used and OS.
I´m using RedHat AS 3.0 httpd -V Server version: Apache/2.0.46 Server built: Sep 6 2005 11:21:32 Server's Module Magic Number: 20020903:4 Architecture: 32-bit Server compiled with.... -D APACHE_MPM_DIR="server/mpm/prefork" -D APR_HAS_SENDFILE -D APR_HAS_MMAP -D APR_HAVE_IPV6 (IPv4-mapped addresses disabled) -D APR_USE_SYSVSEM_SERIALIZE -D APR_USE_PTHREAD_SERIALIZE -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT -D APR_HAS_OTHER_CHILD -D AP_HAVE_RELIABLE_PIPED_LOGS -D HTTPD_ROOT="/etc/httpd" -D SUEXEC_BIN="/usr/sbin/suexec" -D DEFAULT_PIDLOG="logs/httpd.pid" -D DEFAULT_SCOREBOARD="logs/apache_runtime_status" -D DEFAULT_LOCKFILE="logs/accept.lock" -D DEFAULT_ERRORLOG="logs/error_log" -D AP_TYPES_CONFIG_FILE="conf/mime.types" -D SERVER_CONFIG_FILE="conf/httpd.conf" Latest mod_jk - 1.2.15 (I tested on 1.2.14 and it´s the same) Let me know if you need something else.
Can you post your workers.properties file and global JkXXXX directives. Also, set the 'JkLogLevel debug' to httpd.conf, and post the mod_jk.log for: > apachectrl start > apachectrl stop One other question, is it working with 500 vhosts?
Created attachment 16949 [details] debug file for mod_jk
workers.properties, only the uncomment part: worker.list=ajp13 worker.ajp13.port=8009 worker.ajp13.host=localhost worker.ajp13.type=ajp13 worker.ajp13.lbfactor=1 worker.ajp13.cachesize=10 worker.ajp13.cache_timeout=600 worker.ajp13.socket_keepalive=0 worker.ajp13.socket_timeout=300 It works with 1500 ;)). But then CGI works very slowly as described in this issue. I´m attaching the debug log. I turned the server on, executed the cgi script and the turned it off.
Your log statements look inconsistent to me: [pid 30864] write(1, "Content-Type: text/html\n\nTest"..., 72) = 72 <0.000020> So PID 30864 correctly writes your CGI output after a very short time. The next entry [pid 30785] <... poll resumed> [{fd=26, events=POLLIN, revents=POLLIN}, {fd=28, events=POLLIN}], 2, 30000) = 1 <2.120682> comes from a totally different process 30785 and for me it does NOT mean, that the request took 2 seconds. [Fri Nov 11 12:55:50 2005] [25756:39072] [debug] jk_cleanup_shmem::mod_jk.c (1761): Shmem cleanup [Fri Nov 11 12:55:53 2005] [25756:39072] [debug] jk_cleanup_shmem::mod_jk.c (1761): Shmem cleanup These two messages come from PID 25756 which is again another process. Furthermore the PID is numerically very different from the PIDs above, so it seems that this process has not been started/forked around the same time as the other two ones. You can add %P to apaches log format to log the PID which processed the request and %T to get the request duration in seconds. Nevertheless I find it strange, that the same PID and thread id logs a cleanup two times.
I just looked at your mod_jk.log. Again the PID of the process doing two cleanups in a row is not contained in the startup log lines. I guess it's the forked CGI process, that does the jk cleanup. I think that's not to be expected - Mladen, what do you think?
The log I send is different from the first post (I did another test when you asked). Also when I did the strace, I had to choose a child process and then hit the web a couple of times, till it uses the one I chose. Sorry, if it looks inconsistent. If you want I can try to provide a log with the right PIDs, but they will look exactly as what I posted. Thanks again.
25756 is the forked CGI process. On the test perl script change print line to: print "Test - PID process $$ !!"; and you´ll see that this is the process that calls jk_cleanup_shmem (!? - weird...)
*** Bug 37470 has been marked as a duplicate of this bug. ***
Fixed in the SVN. See: http://svn.apache.org/viewcvs?rev=386604&view=rev