BTW, I would prefer for the semaphore type to be configurable in the Apache configuration; then I wouldn't have to rebuild my distribution's mod_python packages to use them with the ITK MPM.
If you'd accept such a patch then I'd be happy to try and implement it.

A configuration option could possibly be modelled on how the AcceptMutex directive for Apache works, i.e.:

http://httpd.apache.org/docs/2.2/mod/mpm_common.html#acceptmutex

We would just need to decide whether to do it as a PythonOption or to introduce a new directive instead.

Note there is an interesting comment in there about using pthread which may be relevant to the problem you are seeing. Specifically:

    On most systems, when the pthread option is selected, if a child process terminates abnormally while holding the AcceptCntl mutex the server will stop responding to requests. When this occurs, the server will require a manual restart to recover. Solaris is a notable exception as it provides a mechanism, used by Apache, which usually allows the mutex to be recovered after a child process terminates abnormally while holding a mutex. If your system implements the pthread_mutexattr_setrobust_np() function, you may be able to use the pthread option safely.

We seem to be going pretty far down the road with PythonOption and our new namespace, so I'm inclined to stick with that, unless there is some sort of performance implication.

Jim

I did some investigating today and it seems that a pthread_mutexattr_setrobust_np function was introduced in glibc 2.4, so mod_python could check for it at build time.
My platform (Debian) will stick with glibc 2.3 until the release of Debian 4.0 ("etch"), so I'm inclined to add a PythonOption for this anyway.

mod_python should perhaps simply be using 'ap_accept_lock_mech' instead of 'APR_LOCK_DEFAULT'.
The default value of 'ap_accept_lock_mech' is more likely to be correct for the type of MPM being used. If it is not, it can be overridden with the existing AcceptMutex directive, which would have to be done anyway to get the MPM to work if 'APR_LOCK_DEFAULT' is wrong for that MPM.
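
To make the PythonOption idea discussed above a bit more concrete, here is a rough sketch of how an option value could be mapped onto a lock mechanism, using the same keywords that AcceptMutex accepts in Apache 2.2 (default, flock, fcntl, posixsem, pthread, sysvsem). Everything named sketch_* is made up for illustration: the enum is only a stand-in for APR's apr_lockmech_e so that the code compiles without the APR headers, and no such option exists in mod_python at the moment.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical stand-in for APR's apr_lockmech_e, so that this
 * sketch builds without the APR headers. */
typedef enum {
    SKETCH_LOCK_FCNTL,
    SKETCH_LOCK_FLOCK,
    SKETCH_LOCK_SYSVSEM,
    SKETCH_LOCK_PROC_PTHREAD,
    SKETCH_LOCK_POSIXSEM,
    SKETCH_LOCK_DEFAULT
} sketch_lockmech;

/* Map an option value onto a lock mechanism, case-insensitively.
 * Returns 0 and fills *out on success; returns -1 for an unknown
 * or missing value and leaves *out untouched. */
int sketch_parse_lockmech(const char *value, sketch_lockmech *out)
{
    static const struct {
        const char *name;
        sketch_lockmech mech;
    } table[] = {
        { "default",  SKETCH_LOCK_DEFAULT },
        { "flock",    SKETCH_LOCK_FLOCK },
        { "fcntl",    SKETCH_LOCK_FCNTL },
        { "posixsem", SKETCH_LOCK_POSIXSEM },
        { "pthread",  SKETCH_LOCK_PROC_PTHREAD },
        { "sysvsem",  SKETCH_LOCK_SYSVSEM }
    };
    size_t i;

    assert(out != NULL);
    if (value == NULL)
        return -1;

    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcasecmp(value, table[i].name) == 0) {
            *out = table[i].mech;
            return 0;
        }
    }
    return -1;
}
```

In a real patch the result would presumably be handed to whatever APR call creates the session mutex, and an unknown value would be reported as a configuration error rather than silently ignored. A real implementation would also have to reject mechanisms that the local APR build does not support.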
The child processes are never killed by the parent server, and any further web requests will also hang while trying to grab the session semaphore. This remains the case even if I manually kill all the child processes (I guess the semaphore remains grabbed forever in the parent process).
I rebuilt with the APR_LOCK_FCNTL type semaphore, and I can now hammer refresh as fast as possible (intervals of ~0.08 seconds) without seeing any deadlocks. I don't know whether this is because the PROC_PTHREAD semaphores are buggy, or whether the FCNTL semaphores are actually faster (fast enough to avoid the deadlock).
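
One possible explanation for the difference, in line with the AcceptMutex documentation, is that an fcntl record lock belongs to the process and is dropped by the kernel when that process dies, whereas a non-robust pthread mutex stays locked. The sketch below only demonstrates the fcntl half of that on a POSIX system; it does not show that this is what actually happens inside mod_python. The function names and the temporary file path are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Apply an fcntl lock operation (F_SETLK or F_SETLKW) of the given
 * type to the whole file. Returns fcntl's result. */
static int sketch_set_lock(int fd, int cmd, short type)
{
    struct flock fl;

    assert(fd >= 0);
    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;   /* l_start = 0, l_len = 0: whole file */
    return fcntl(fd, cmd, &fl);
}

/* A child takes a write lock and is then killed while holding it.
 * Returns 1 if the parent saw the lock held and could take it after
 * the child died, 0 if not, and -1 on a setup error. */
int sketch_fcntl_lock_released_on_death(void)
{
    char path[] = "/tmp/fcntl-sketch-XXXXXX";
    int pipefd[2], fd, held, reacquired;
    struct flock probe;
    char byte;
    pid_t pid;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);             /* the file lives on while fd is open */
    if (pipe(pipefd) != 0) {
        close(fd);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: lock, tell the parent, then wait to be killed. */
        close(pipefd[0]);
        if (sketch_set_lock(fd, F_SETLKW, F_WRLCK) != 0)
            _exit(1);
        if (write(pipefd[1], "x", 1) != 1)
            _exit(1);
        for (;;)
            pause();
    }

    close(pipefd[1]);
    if (read(pipefd[0], &byte, 1) != 1) {
        /* The child failed before it could report the lock. */
        kill(pid, SIGKILL);
        waitpid(pid, NULL, 0);
        close(pipefd[0]);
        close(fd);
        return -1;
    }

    /* F_GETLK reports a conflicting lock held by another process. */
    memset(&probe, 0, sizeof probe);
    probe.l_type = F_WRLCK;
    probe.l_whence = SEEK_SET;
    held = (fcntl(fd, F_GETLK, &probe) == 0 && probe.l_type != F_UNLCK);

    /* Simulate a child that terminates abnormally while holding it. */
    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);

    /* Non-blocking attempt, so a failure here cannot hang. */
    reacquired = (sketch_set_lock(fd, F_SETLK, F_WRLCK) == 0);

    close(pipefd[0]);
    close(fd);
    return held && reacquired;
}
```

Calling sketch_fcntl_lock_released_on_death() from a small main() and checking for a return value of 1 is enough to try this out on a given platform.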