  1. mod_python
  2. MODPYTHON-112

When filters are used, the value of req.phase is only valid up until the first req.read()/req.write().

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.4, 3.2.7
    • Fix Version/s: 3.3.1
    • Component/s: core
    • Labels:
      None

      Description

      The request object provides a member variable called 'phase' described as:

      The phase currently being processed, e.g. "PythonHandler". (Read-Only)

      If no Python based input and output filters are used, the value of req.phase will be constant for the life of a request phase. If however you use an input or output filter, the value of req.phase can change.

      Consider that we are in the content handler phase and where there is a Python based output filter, but no Python based input filter. On initially entering the request handler, the value of req.phase will be "PythonHandler". As soon as req.write() is called however, the value of req.phase changes to "PythonOutputFilter".

      Now, if there is a Python based input filter, but no Python based output filter, the value of req.phase will change to "PythonInputFilter" as soon as req.read() is called.

      If there are both Python based input and output filters, the value of req.phase will in turn change to "PythonInputFilter" and "PythonOutputFilter" as req.read() and then req.write() are in turn called.

      The reason for all this is that the get_request_object() function in src/mod_python.c contains the following code:

      /* make a note of which phase we are in right now */
      Py_XDECREF(request_obj->phase);
      if (phase)
          request_obj->phase = PyString_FromString(phase);
      else
          request_obj->phase = PyString_FromString("");

      That is, whenever called to get the request object, it will update req.phase. This will occur even if the request object had already been created, as will be the case when there is a Python based content handler and either a Python based input or output filter.
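The effect described above can be reproduced outside Apache with a small simulation. This is only a sketch: FakeRequest and its input_filter/output_filter flags are hypothetical stand-ins, not mod_python classes, and get_request_object() here merely mirrors the unconditional assignment in the C code quoted above.

```python
class FakeRequest:
    """Hypothetical stand-in for the mod_python request object."""

    def __init__(self, input_filter=False, output_filter=False):
        self.phase = None
        self.input_filter = input_filter
        self.output_filter = output_filter

    def read(self):
        # With a Python input filter present, the filter acquires the
        # request object and passes its own directive name as the phase.
        if self.input_filter:
            get_request_object(self, "PythonInputFilter")
        return b""

    def write(self, data):
        # Same for a Python output filter.
        if self.output_filter:
            get_request_object(self, "PythonOutputFilter")


def get_request_object(req, phase):
    # Mirrors the quoted C logic: the phase is overwritten on every call.
    req.phase = phase if phase else ""
    return req


req = FakeRequest(input_filter=True, output_filter=True)
get_request_object(req, "PythonHandler")  # entering the content handler

observed = [req.phase]
req.read()
observed.append(req.phase)
req.write(b"hello")
observed.append(req.phase)

print(observed)
```

With both filters enabled, the recorded values go from "PythonHandler" to "PythonInputFilter" to "PythonOutputFilter", which is the sequence described in this report.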

      Overall this behaviour is a bit strange and unexpected. It would seem to me that if there is both a handler and a filter, the value of req.phase should be left as the name of the handler phase. I.e., it would stay, for example, as "PythonHandler" and not be changed to "PythonInputFilter" or "PythonOutputFilter".

      One can't just change the code in get_request_object() not to update it if already set, as it has to be updated when one moves from one phase to another. One has to contend with where this function is called in python_filter() function, namely:

      /* create/acquire request object */
      request_obj = get_request_object(req, interp_name,
                                       is_input ? "PythonInputFilter" : "PythonOutputFilter");

      A first step may be simply not to pass in "PythonInputFilter" or "PythonOutputFilter" and instead just call it as:

      request_obj = get_request_object(req, interp_name, 0);

      At the same time change get_request_object() to:

      if (phase) {
          /* only release the old value when it is about to be replaced */
          Py_XDECREF(request_obj->phase);
          request_obj->phase = PyString_FromString(phase);
      }
      else if (!request_obj->phase)
          request_obj->phase = PyString_FromString("");

      I.e., the result is that if req.phase is already set, it is left alone when get_request_object() is called by python_filter() with phase set to 0. If req.phase isn't already set, it is set to the empty string.

      The consequence of this is that if there are filters but no Python based handlers, req.phase will be set to an empty string. Another strange case is that if the only Python based handler was for an earlier phase than the one consuming or generating the content to be processed, then req.phase will be set to a value corresponding to that earlier phase and not to where the current action is. For example, if there was a PythonAccessHandler but the content then came from a static file, the output filter would see "PythonAccessHandler".
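The three outcomes of this proposed change can be checked with a small simulation. This is a sketch only: FakeRequest is a hypothetical stand-in, get_request_object() models just the proposed "leave it alone if already set" rule, and "PythonAccessHandler" is assumed to be the value recorded for the access phase.

```python
class FakeRequest:
    """Hypothetical stand-in for the mod_python request object."""

    def __init__(self):
        self.phase = None


def get_request_object(req, phase=None):
    # Proposed rule: a real phase name always wins; a call without one
    # (as python_filter() would make) only fills in "" when nothing is set.
    if phase:
        req.phase = phase
    elif req.phase is None:
        req.phase = ""
    return req


# Case 1: content handler plus a filter. The filter call leaves the phase alone.
req = FakeRequest()
get_request_object(req, "PythonHandler")
get_request_object(req)
handler_and_filter = req.phase

# Case 2: filter only, no Python handler ran. The filter sees an empty string.
req = FakeRequest()
get_request_object(req)
filter_only = req.phase

# Case 3: only an earlier Python handler ran and the content is static.
# The filter sees the stale name of that earlier phase.
req = FakeRequest()
get_request_object(req, "PythonAccessHandler")
get_request_object(req)
stale_phase = req.phase

print(handler_and_filter, repr(filter_only), stale_phase)
```

Case 3 is the awkward one: nothing resets the value between phases in this model, so the filter reports a phase that finished long before it ran.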

      Whatever is done, there will be some strangeness. The question therefore remains of what should be done, or whether the behaviour should be left as it is and the documentation changed to say that req.phase is only valid up until the first call to req.read() or req.write() within a handler.

      This is not an ideal situation though, as a handler may for some reason want to interrogate req.phase after req.read() or req.write() has been called, which would yield incorrect results if a filter is being used. An example of where this occurs is the error reporting in HandlerDispatch() of mod_python.apache itself. When it generates an error message, the phase shown in the message will be wrong if there was a filter. It should perhaps at least be changed to save req.phase at the start of dispatch so that it is known to be correct for any later error messages.

      Any comments?????

        Activity

        grahamd Graham Dumpleton added a comment -

        The implications of this problem could be significant where multiple handlers are specified for a single phase. Ie., as specified in Apache configuration:

        PythonHandler example1
        PythonHandler example2

        or if subsequent handlers are added using req.add_handler().

        The reason for this is that the default handler name is calculated in HandlerDispatch() from req.phase. It does this calculation for each handler to be called instead of just once:

        while hlist.handler is not None:

            # split module::handler
            l = hlist.handler.split('::', 1)

            module_name = l[0]
            if len(l) == 1:
                # no object, provide default
                object_str = req.phase[len("python"):].lower()
            else:
                object_str = l[1]

        If the handler example1::handler() were to call req.read()/req.write() where a Python based input/output filter is present, when the HandlerDispatch() calculates object_str for the second handler, it will try and call "inputfilter" or "outputfilter" instead of "handler".
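As a quick check of the name derivation, the slicing expression from the loop can be run on its own. The helper name default_object_name is made up for this illustration; only the expression inside it comes from the code above.

```python
def default_object_name(phase):
    # Same expression as in HandlerDispatch(): strip the leading "Python"
    # from the directive name and lowercase the remainder.
    return phase[len("python"):].lower()


names = [default_object_name(p)
         for p in ("PythonHandler", "PythonInputFilter", "PythonOutputFilter")]
print(names)
```

Once a filter has overwritten req.phase, the derived name is no longer "handler" but the lowercased filter directive.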

        This is again solved by caching phase early on:

        phase = req.phase
        default_handler = phase[len("python"):].lower()

        while hlist.handler is not None:

            # split module::handler
            l = hlist.handler.split('::', 1)

            module_name = l[0]
            if len(l) == 1:
                # no object, provide default
                object_str = default_handler
            else:
                object_str = l[1]

        The later except blocks can then also be changed to not use req.phase.

        except PROG_TRACEBACK, traceblock:
            # Program run-time error
            try:
                (etype, value, traceback) = traceblock
                result = self.ReportError(etype, value, traceback, req=req,
                                          phase=phase, hname=hlist.handler,
                                          debug=debug)
            finally:
                traceback = None

        except:
            # Any other error (usually parsing)
            try:
                exc_type, exc_value, exc_traceback = sys.exc_info()
                result = self.ReportError(exc_type, exc_value, exc_traceback, req=req,
                                          phase=phase, hname=hlist.handler, debug=debug)
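
The difference that caching makes for two stacked handlers can be shown with a standalone simulation. This is a sketch under stated assumptions: FakeRequest and resolve_names() are hypothetical, the handler list replaces hlist, and the overwrite of req.phase after each handler stands in for a handler calling req.write() while a Python output filter is present.

```python
class FakeRequest:
    """Hypothetical stand-in for the mod_python request object."""

    def __init__(self, phase):
        self.phase = phase


def resolve_names(req, handlers, cache_phase):
    """Return the (module, object) pair looked up for each handler entry."""
    # Cached once, before any handler has had a chance to run.
    default_handler = req.phase[len("python"):].lower()
    resolved = []
    for entry in handlers:
        parts = entry.split("::", 1)
        if len(parts) == 1:
            if cache_phase:
                name = default_handler
            else:
                # Old behaviour: recompute from req.phase on every iteration.
                name = req.phase[len("python"):].lower()
        else:
            name = parts[1]
        resolved.append((parts[0], name))
        # Simulate the handler writing output through a Python output filter,
        # which overwrites req.phase as described in this issue.
        req.phase = "PythonOutputFilter"
    return resolved


handlers = ["example1", "example2"]
uncached = resolve_names(FakeRequest("PythonHandler"), handlers, cache_phase=False)
cached = resolve_names(FakeRequest("PythonHandler"), handlers, cache_phase=True)

print(uncached)
print(cached)
```

Without caching, the second entry resolves to example2::outputfilter; with caching, both entries resolve to their handler() function.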
        grahamd Graham Dumpleton added a comment -

        Attached patch "grahamd_20060223_1.diff".

        Would like confirmation on final solution for this issue before I commit changes.

        Behaviour would be such that req.phase is always set to be the phase of the currently executing handler. If an input or output filter is triggered it will no longer overwrite the value of req.phase. Thus, if in the content handler, req.phase will stay as "PythonHandler" for the whole time the handler is executing.

        If there is no mod_python content handler and just a mod_python input or output filter, accessing filter.req.phase from the filter will result in the Python None value. This will be the case even if a prior phase to the content handler executed as a mod_python handler, eg. "PythonFixupHandler", as req.phase will be cleared at the end of any mod_python handler phase.

        That is, at no time will req.phase be set to be either PythonInputFilter or PythonOutputFilter like it used to. Whether a filter is running in input or output mode is determinable from filter.is_input so no need for filter directive name to be put in req.phase. This does not cause any problems for CallBack.FilterDispatch() in apache.py as it didn't use req.phase, but used filter.is_input where necessary. The change to CallBack.HandlerDispatch() in apache.py to cache the phase the handler is executing in as described previously for this issue, is no longer necessary with the way this fix works.
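
The behaviour described in this comment can be modelled as follows. This is a sketch, not the patch itself: FakeRequest, FakeFilter, run_handler_phase() and describe_filter() are hypothetical names, and the model only captures the two rules stated above (filters never touch req.phase, and req.phase is cleared at the end of each handler phase).

```python
class FakeRequest:
    """Hypothetical stand-in for the mod_python request object."""

    def __init__(self):
        self.phase = None


class FakeFilter:
    """Hypothetical stand-in for the mod_python filter object."""

    def __init__(self, req, is_input):
        self.req = req
        self.is_input = is_input


def run_handler_phase(req, phase, handler):
    # req.phase names the executing handler phase and is cleared afterwards.
    req.phase = phase
    try:
        return handler(req)
    finally:
        req.phase = None


def describe_filter(flt):
    # A filter learns its direction from is_input, not from req.phase.
    direction = "input" if flt.is_input else "output"
    return direction, flt.req.phase


def content_handler(req):
    # Stands in for req.write() triggering the output filter mid-handler.
    return describe_filter(FakeFilter(req, is_input=False))


req = FakeRequest()

# An earlier Python handler phase runs and finishes; its phase is cleared.
run_handler_phase(req, "PythonFixupHandler", lambda r: None)
after_fixup = describe_filter(FakeFilter(req, is_input=False))

# A filter triggered while the content handler is executing.
during_handler = run_handler_phase(req, "PythonHandler", content_handler)

print(after_fixup, during_handler)
```

A filter running outside any Python handler sees None, while a filter triggered from within the content handler still sees "PythonHandler".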


          People

          • Assignee:
            grahamd Graham Dumpleton
            Reporter:
            grahamd Graham Dumpleton