mod_python
  1. mod_python
  2. MODPYTHON-109

Signal handler calling Py_Finalize() when child processes being killed.

    Details

    • Type: Bug Bug
    • Status: Reopened
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 3.2.7
    • Fix Version/s: None
    • Component/s: core
    • Labels:
      None

      Description

      When Apache is killing off child processes as part of actions taken when the "apachectl restart" or "apachectl graceful" command is run, it sends a SIGTERM signal to the child processes. This causes a signal handler registered by Apache to be run. That signal handler destroys the main child memory pool. That memory pool has though a cleanup handler associated with it which was registered by mod_python. That cleanup handler ultimately calls Py_Finalize().

      The problem with this is that Py_Finalize() isn't safe to be called from a signal handler and if a handler is still executing or there is a separate thread running in the context of Python, a deadlock will likely ensue. This will prevent the child process exiting due to the SIGTERM causing the Apache parent process to send it a SIGKILL to really kill it.

      For a more detailed assessment of the problem and what lead to this conclusion see:

      http://www.modpython.org/pipermail/mod_python/2006-January/019865.html
      http://www.modpython.org/pipermail/mod_python/2006-January/019866.html
      http://www.modpython.org/pipermail/mod_python/2006-January/019870.html

      To avoid the problem, the only choice seems to be avoid calling Py_Finalize() from the signal handler. The simplistic way of doing this seems to be to add:

      if (child_init_pool)
      return APR_SUCCESS;

      at the start of python_finalize(). This will mean that Py_Finalize() is never called in child processes. The full consequences of this is unknown, but on face value it would seem that it might be a reasonable thing to do. More research may be required.

        Activity

        Graham Dumpleton created issue -
        Hide
        Graham Dumpleton added a comment -

        The problem is actually bigger than just Py_Finalize(). When a server cleanup handler is registered using either apache.register_cleanup() or req.server.register_cleanup(), the execution of that handler is also registered to be called by associating it with the destruction of the main child memory pool. This means that such server cleanup handlers are also going to be executed from the context of a signal handler and thus are going to be a problem. It may be necessary to remove the feature of registering server cleanup handlers as there seems to be no other way of doing the same thing.

        Show
        Graham Dumpleton added a comment - The problem is actually bigger than just Py_Finalize(). When a server cleanup handler is registered using either apache.register_cleanup() or req.server.register_cleanup(), the execution of that handler is also registered to be called by associating it with the destruction of the main child memory pool. This means that such server cleanup handlers are also going to be executed from the context of a signal handler and thus are going to be a problem. It may be necessary to remove the feature of registering server cleanup handlers as there seems to be no other way of doing the same thing.
        Graham Dumpleton made changes -
        Field Original Value New Value
        Assignee Graham Dumpleton [ grahamd ]
        Graham Dumpleton made changes -
        Status Open [ 1 ] In Progress [ 3 ]
        Hide
        Graham Dumpleton added a comment -

        Better fix for this if calling of Py_Finalize() is to be eliminated would be simply not to register python_finalize() function against the child init pool. This would be done by deleting line:

        apr_pool_cleanup_register(p, NULL, python_finalize, apr_pool_cleanup_null);

        from PythonChildInitHandler() function. The whole python_finalize() function would then no longer be needed and could also be deleted.

        If the functionality of registering Python based server cleanup functions were to be removed, the server register_cleanup() functions should probably be left in place as stubs. That is, they would no longer do anything. They should though log a message indicating that they are now deprecated. By not removing them completely, existing code will at least still run. Given that the cleanup functions would not have been reliably called in the past and couldn't be relied upon, it shouldn't adversely affect any actual web application that used them.

        Feedback on this issue is needed. Any decision about what is done should be made by all core developers.

        Show
        Graham Dumpleton added a comment - Better fix for this if calling of Py_Finalize() is to be eliminated would be simply not to register python_finalize() function against the child init pool. This would be done by deleting line: apr_pool_cleanup_register(p, NULL, python_finalize, apr_pool_cleanup_null); from PythonChildInitHandler() function. The whole python_finalize() function would then no longer be needed and could also be deleted. If the functionality of registering Python based server cleanup functions were to be removed, the server register_cleanup() functions should probably be left in place as stubs. That is, they would no longer do anything. They should though log a message indicating that they are now deprecated. By not removing them completely, existing code will at least still run. Given that the cleanup functions would not have been reliably called in the past and couldn't be relied upon, it shouldn't adversely affect any actual web application that used them. Feedback on this issue is needed. Any decision about what is done should be made by all core developers.
        Hide
        Jim Gallacher added a comment -

        Just how unreliable is the use of server.register_cleanup? I wonder how many people may be using it as is, and accepting that it works imperfectly?

        Show
        Jim Gallacher added a comment - Just how unreliable is the use of server.register_cleanup? I wonder how many people may be using it as is, and accepting that it works imperfectly?
        Hide
        Graham Dumpleton added a comment -

        It will be random as to whether a registered cleanup handler in this situation would be able to run, let alone work.

        Ignoring the fact that you are doing complicated stuff in a signal handler, which is nearly always bad, the simple fact is that if a normal request handler is in the middle of handling a request and is thus holding the GIL when the parent Apache process decides to send the signal, the cleanup handler will block waiting to acquire the GIL.

        On UNIX systems at least where a signal handler causes main thread and any other threads to stop for the period the signal handler is run (not sure about Win32 where a signal handler is a distinct thread) , the request handler will never be able to release the GIL so the cleanup handler can't even begin to run.

        Thus if you are lucky, the cleanup handler will not block and will run, but even then the code is being run from a signal handler which will cause problems in itself and may result in the process crashing.

        All very dodgy.

        Show
        Graham Dumpleton added a comment - It will be random as to whether a registered cleanup handler in this situation would be able to run, let alone work. Ignoring the fact that you are doing complicated stuff in a signal handler, which is nearly always bad, the simple fact is that if a normal request handler is in the middle of handling a request and is thus holding the GIL when the parent Apache process decides to send the signal, the cleanup handler will block waiting to acquire the GIL. On UNIX systems at least where a signal handler causes main thread and any other threads to stop for the period the signal handler is run (not sure about Win32 where a signal handler is a distinct thread) , the request handler will never be able to release the GIL so the cleanup handler can't even begin to run. Thus if you are lucky, the cleanup handler will not block and will run, but even then the code is being run from a signal handler which will cause problems in itself and may result in the process crashing. All very dodgy.
        Hide
        Graham Dumpleton added a comment -

        Attached "MP109_20060308_grahamd_1.diff" containing proposed changes.

        All this does is prevent python_finalize() being called. Nothing done about register_cleanup() functions at this time. Patches also fix up incorrect test harness and documentation.

        Show
        Graham Dumpleton added a comment - Attached "MP109_20060308_grahamd_1.diff" containing proposed changes. All this does is prevent python_finalize() being called. Nothing done about register_cleanup() functions at this time. Patches also fix up incorrect test harness and documentation.
        Graham Dumpleton made changes -
        Attachment MP109_20060308_grahamd_1.diff [ 12323922 ]
        Graham Dumpleton made changes -
        Status In Progress [ 3 ] Open [ 1 ]
        Hide
        Graham Dumpleton added a comment -

        Mark to be addressed in 3.3. Specifically, unless good reasons are found not to, the apache.register_cleanup() and req.server.register_cleanup() methods will be deprecated and implementations stubbed out to log a message to Apache error log indicate they are no longer supported. The current code is just too unreliable, can't be guaranteed to run and causes problems in shutdown of Apache child processes.

        Show
        Graham Dumpleton added a comment - Mark to be addressed in 3.3. Specifically, unless good reasons are found not to, the apache.register_cleanup() and req.server.register_cleanup() methods will be deprecated and implementations stubbed out to log a message to Apache error log indicate they are no longer supported. The current code is just too unreliable, can't be guaranteed to run and causes problems in shutdown of Apache child processes.
        Graham Dumpleton made changes -
        Fix Version/s 3.3 [ 12310101 ]
        Hide
        Graham Dumpleton added a comment -

        After more digging on this issue my opinion is now that we should just leave the register_cleanup() functions as is. That is don't allow Py_Finalize() to be called, but with the register_cleanup() functions being left in place. If it is shown somehow that the area is definitely a problem, we can revisit it then, but right now all I can say is that I can't set up test cases which show an issue.

        The implications of not calling Py_Finalize() are that any functions registered with the Python atexit module will not be called and Python objects will not be destroyed. The atexit limitation isn't a big deal as only functions which were registered in the main_interpreter (after MODPYTHON-77 changes) were called anyway, and the register_cleanup() functions should be used instead.

        More importantly, by not calling Py_Finalize() you don't start destroying all your interpreters while there are potentially user created threads still running. If one does call it and threads are still running, they can start crashing with Python exceptions. These wouldn't probably be noticed as stderr at that point probably isn't going to go anywhere, but when you catch exceptions and log them to the Apache error log you do see them. Thus, probably better to just let the threads run with a full environment up to the very point that the process it exited.

        Consequently, am going to mark this as resolved.

        Show
        Graham Dumpleton added a comment - After more digging on this issue my opinion is now that we should just leave the register_cleanup() functions as is. That is don't allow Py_Finalize() to be called, but with the register_cleanup() functions being left in place. If it is shown somehow that the area is definitely a problem, we can revisit it then, but right now all I can say is that I can't set up test cases which show an issue. The implications of not calling Py_Finalize() are that any functions registered with the Python atexit module will not be called and Python objects will not be destroyed. The atexit limitation isn't a big deal as only functions which were registered in the main_interpreter (after MODPYTHON-77 changes) were called anyway, and the register_cleanup() functions should be used instead. More importantly, by not calling Py_Finalize() you don't start destroying all your interpreters while there are potentially user created threads still running. If one does call it and threads are still running, they can start crashing with Python exceptions. These wouldn't probably be noticed as stderr at that point probably isn't going to go anywhere, but when you catch exceptions and log them to the Apache error log you do see them. Thus, probably better to just let the threads run with a full environment up to the very point that the process it exited. Consequently, am going to mark this as resolved.
        Graham Dumpleton made changes -
        Resolution Fixed [ 1 ]
        Status Open [ 1 ] Resolved [ 5 ]
        Hide
        Graham Dumpleton added a comment -

        Am reopening this issue as due to work on mod_wsgi I believe I have discovered that problems with this are nothing to do with signal handlers.

        The first source of Python exceptions that were seen when process was being shutdown were due to the fact that PyFinalize() was not being called from what the main Python interpreter's 'threading' module saw as being a valid Python thread, ie., it expected it to be the main thread for the interpreter, when in fact it was a foreign thread created outside of Python. The consequence of this was that when Py_Finalize() was called and it triggered sys.exitfunc() to call any atexit registered callbacks, a callback registered by the 'threading' module was called and because the thread ID wasn't in the table of Python threads, it threw a Python exception.

        The next problem is with sub interpreters other than the main Python interpreter. The issue here was that Py_Finalize() only calls sys.exitfunc() on the main Python interpreter. As a consequence, any thread cleanup is not done in sub interpreters when they are being destroyed and depending on what a thread was doing, this could cause the process to crash.

        What should happen on process shutdown is that it should iterate over all interpreters, create a thread state against each interpreter in turn, call 'threading.currentThread()' to ensure that 'threading' creates a dummy thread object for that thread state, then call sys.exitfunc(). After having done that, then it can call Py_Finalize().

        By ensuring that exit funcs are called, means that atexit.register() then becomes an alternate way to register functions to be called on process shutdown, which is a much better option for generic modules than trying to hook into apache.register_cleanup().

        Show
        Graham Dumpleton added a comment - Am reopening this issue as due to work on mod_wsgi I believe I have discovered that problems with this are nothing to do with signal handlers. The first source of Python exceptions that were seen when process was being shutdown were due to the fact that PyFinalize() was not being called from what the main Python interpreter's 'threading' module saw as being a valid Python thread, ie., it expected it to be the main thread for the interpreter, when in fact it was a foreign thread created outside of Python. The consequence of this was that when Py_Finalize() was called and it triggered sys.exitfunc() to call any atexit registered callbacks, a callback registered by the 'threading' module was called and because the thread ID wasn't in the table of Python threads, it threw a Python exception. The next problem is with sub interpreters other than the main Python interpreter. The issue here was that Py_Finalize() only calls sys.exitfunc() on the main Python interpreter. As a consequence, any thread cleanup is not done in sub interpreters when they are being destroyed and depending on what a thread was doing, this could cause the process to crash. What should happen on process shutdown is that it should iterate over all interpreters, create a thread state against each interpreter in turn, call 'threading.currentThread()' to ensure that 'threading' creates a dummy thread object for that thread state, then call sys.exitfunc(). After having done that, then it can call Py_Finalize(). By ensuring that exit funcs are called, means that atexit.register() then becomes an alternate way to register functions to be called on process shutdown, which is a much better option for generic modules than trying to hook into apache.register_cleanup().
        Graham Dumpleton made changes -
        Status Resolved [ 5 ] Reopened [ 4 ]
        Resolution Fixed [ 1 ]
        Hide
        Graham Dumpleton added a comment -

        In Python 2.5.1c1 they have changed how the 'threading' module makes use of the atexit module to shutdown threading. Instead of using the atexit module, the C code Python main routine now explicitly calls 'threading._shutdown()' instead. Any changes to mod_python will thus have to call 'threading._shutdown()' if it exists and then call 'sys.exitfunc()'.

        Show
        Graham Dumpleton added a comment - In Python 2.5.1c1 they have changed how the 'threading' module makes use of the atexit module to shutdown threading. Instead of using the atexit module, the C code Python main routine now explicitly calls 'threading._shutdown()' instead. Any changes to mod_python will thus have to call 'threading._shutdown()' if it exists and then call 'sys.exitfunc()'.
        Graham Dumpleton made changes -
        Fix Version/s 3.3.1 [ 12312496 ]

          People

          • Assignee:
            Graham Dumpleton
            Reporter:
            Graham Dumpleton
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:

              Development