  1. mod_python
  2. MODPYTHON-129

HandlerDispatch doesn't treat OK/DECLINED result properly for all phases.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.7
    • Fix Version/s: 3.3.1
    • Component/s: core
    • Labels:
      None

      Description

      Today's daily bug report, or is it?

      The Python*Handler documentation says:

      """Multiple handlers can be specified on a single line, in which case they will be called sequentially, from left to right. Same handler directives can be specified multiple times as well, with the same result - all handlers listed will be executed sequentially, from first to last. If any handler in the sequence returns a value other than apache.OK, then execution of all subsequent handlers is aborted."""

      That is, no matter which phase is being processed, mod_python will stop processing them if a value other than OK is returned.

      The problem is that this isn't how Apache itself treats the result from handlers. Apache actually implements two different ways of dealing with the result from the handlers. Which one is used depends on which processing phase is occurring. This is all specified by the Apache magic macro code:

      AP_IMPLEMENT_HOOK_RUN_FIRST(int,translate_name,
      (request_rec *r), (r), DECLINED)

      AP_IMPLEMENT_HOOK_RUN_FIRST(int,map_to_storage,
      (request_rec *r), (r), DECLINED)

      AP_IMPLEMENT_HOOK_RUN_FIRST(int,check_user_id,
      (request_rec *r), (r), DECLINED)

      AP_IMPLEMENT_HOOK_RUN_FIRST(int,auth_checker,
      (request_rec *r), (r), DECLINED)

      AP_IMPLEMENT_HOOK_RUN_ALL(int,access_checker,
      (request_rec *r), (r), OK, DECLINED)

      AP_IMPLEMENT_HOOK_RUN_FIRST(int,type_checker,
      (request_rec *r), (r), DECLINED)

      AP_IMPLEMENT_HOOK_RUN_ALL(int,fixups,
      (request_rec *r), (r), OK, DECLINED)

      What this gobbledygook expands to are loops which will stop processing handlers based on the result.

      For the AP_IMPLEMENT_HOOK_RUN_ALL macro, all handlers in the phase will be run unless one returns something other than OK or DECLINED. Returning OK means that it did something and it worked okay. Returning DECLINED means that it didn't do anything at all. In both these cases, it still goes on to the next handler in that phase. After that it will go on to the next phase.

      Returning an error will cause an appropriate error response to go back to the client, with any other handlers in the phase, as well as later phases, being skipped. Returning DONE is much like returning an error, but Apache interprets it as meaning a complete response was constructed and that it doesn't have to generate any response.

      For the AP_IMPLEMENT_HOOK_RUN_FIRST macro, all handlers will be run only if they all return DECLINED. In other words, if a handler returns OK it will skip the following handlers in that phase and then move onto the next phase. Returning an error or DONE is like above.
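The two loops described above can be sketched in Python as follows. This is only an illustration of the semantics, not Apache's actual macro expansion. The function names run_all and run_first are made up for this sketch. The status values mirror the ones Apache defines for OK, DECLINED and DONE.

```python
# Status codes, using the same numeric values Apache defines in httpd.h.
OK = 0
DECLINED = -1
DONE = -2


def run_all(handlers, req):
    """RUN_ALL: keep going while handlers return OK or DECLINED."""
    for handler in handlers:
        status = handler(req)
        if status not in (OK, DECLINED):
            return status  # an error or DONE stops the phase
    return OK


def run_first(handlers, req):
    """RUN_FIRST: keep going only while handlers return DECLINED."""
    for handler in handlers:
        status = handler(req)
        if status != DECLINED:
            return status  # OK, DONE or an error stops the phase
    return DECLINED
```

With handlers returning DECLINED, OK, OK in that order, run_all calls all three, whereas run_first stops after the second.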

      In the case of mod_python, what it does doesn't fit into either. It is closer to behaving like the AP_IMPLEMENT_HOOK_RUN_ALL macro except that it stops processing further handlers in the phase if DECLINED is returned.
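For comparison, a minimal sketch of what mod_python does at present, as described above. The name run_stacked_old is hypothetical, and the result for an empty handler list is an assumption made only so the function always returns something.

```python
OK = 0
DECLINED = -1


def run_stacked_old(handlers, req):
    """Current mod_python behaviour: continue only while OK is returned."""
    status = DECLINED  # assumed result when no handlers are listed
    for handler in handlers:
        status = handler(req)
        if status != OK:
            break  # DECLINED, DONE or an error all stop the sequence
    return status
```

The difference from the RUN_ALL loop is only that DECLINED ends the sequence here rather than moving on to the next handler.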

      As to what problems this causes, imagine you had registered multiple authentication handlers which supported different authentication mechanisms. This is the case where the AP_IMPLEMENT_HOOK_RUN_FIRST macro is used. The idea is that each authentication handler would check the value associated with the AuthType directive to determine if it should do anything. If it was not the AuthType it implements, then, if it were a C based handler module, it would return DECLINED to indicate it hadn't done anything and that the next handler should be tried instead. Each handler would thus be called until one handler says "that is for me", says the user is valid and returns OK, or returns an error rejecting it.

      If you wanted to write these multiple authentication handlers in Python you can't do it. This is because the way mod_python works, if you return DECLINED it would actually skip the remainder of the mod_python declared handlers whereas you still want them to be executed. Apache would still execute any other C based handlers in the phase though. The only way to get mod_python to execute later mod_python handlers in the phase is to return OK, but if you do that and it happens to be the last handler in the mod_python list of handlers, it will return OK to Apache and Apache will then think a handler successfully handled it and not then execute any subsequent C based handlers in that phase.
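The authentication case can be simulated without Apache at all. In this sketch FakeRequest, basic_authen and digest_authen are all hypothetical stand-ins, and the auth_type attribute is an assumption used in place of whatever the real request object exposes for the AuthType directive.

```python
OK, DECLINED = 0, -1


class FakeRequest:
    """Hypothetical stand-in for a request; not the mod_python request object."""

    def __init__(self, auth_type):
        self.auth_type = auth_type
        self.trace = []  # names of the handlers that were actually called


def basic_authen(req):
    req.trace.append("basic")
    return OK if req.auth_type == "Basic" else DECLINED


def digest_authen(req):
    req.trace.append("digest")
    return OK if req.auth_type == "Digest" else DECLINED


def stacked_old(handlers, req):
    # Current mod_python behaviour: stop as soon as something other than OK comes back.
    status = DECLINED
    for handler in handlers:
        status = handler(req)
        if status != OK:
            break
    return status


def run_first(handlers, req):
    # Apache RUN_FIRST behaviour: stop as soon as something other than DECLINED comes back.
    for handler in handlers:
        status = handler(req)
        if status != DECLINED:
            return status
    return DECLINED


handlers = [basic_authen, digest_authen]

old_req = FakeRequest("Digest")
old_status = stacked_old(handlers, old_req)

new_req = FakeRequest("Digest")
new_status = run_first(handlers, new_req)
```

With AuthType set to Digest, the stacked loop stops after the basic handler declines, so the digest handler is never tried. The RUN_FIRST loop goes on to the digest handler, which accepts.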

      There are going to be other sorts of problems with phases implemented using AP_IMPLEMENT_HOOK_RUN_ALL as well, as a handler that validly returns DECLINED to say it didn't do anything will cause mod_python to skip later mod_python handlers as well. If it were only C based handlers, that wouldn't be the case.

      In summary, it doesn't work how it probably should.

      Note that the above relates to phases other than content handler. Still have to work out what Apache does for content handler phase when there are multiple handlers for the phase.

      No one has probably noticed these problems as no one seems to use mod_python in a serious way for implementing these other phases, simply using mod_python as a jumping off point for content handlers.

        Activity

        grahamd Graham Dumpleton added a comment -

        Missed a "not" in an important place again.

        grahamd Graham Dumpleton added a comment -

        Content handlers also seem to work differently in mod_python than Apache itself when using C handlers or even mod_perl.

        Specifically, in Apache/mod_perl, if multiple modules have registered their desire to be the content handler for a request, Apache will try them each in turn until one returns OK or aborts the transaction with an error code. If a handler returns DECLINED, Apache moves on to the next module in the list.

        In mod_python it will abort if DECLINED is returned and continue if OK is returned.

        The big issue if this were to be fixed is whether people are using more than one handler in a phase, in which case a fix would break existing code. Maybe it should be fixed to work how it really should, with a flag that can be turned on using PythonOption to allow it to run in the old suspect way.

        Comments anyone?????

        grahamd Graham Dumpleton added a comment -

        From a bit of discussion on mailing list, have come to conclusion that how
        content handlers are treated should stay the same. For other phases,
        should be made to work how Apache does things. Final summary post
        from mailing list below.

        Okay, I think I have a good plan now.

        To summarise the whole issue, the way Apache treats multiple handlers in
        a single phase for non content handler phases is as follows:

        PostReadRequestHandler RUN_ALL
        TransHandler RUN_FIRST
        MapToStorageHandler RUN_FIRST
        InitHandler RUN_ALL
        HeaderParserHandler RUN_ALL
        AccessHandler RUN_ALL
        AuthenHandler RUN_FIRST
        AuthzHandler RUN_FIRST
        TypeHandler RUN_FIRST
        FixupHandler RUN_ALL

        LogHandler RUN_ALL

        RUN_ALL means run all handlers until one returns something other than OK
        or DECLINED. Thus, handler needs to return DONE or an error to have it stop
        processing for that phase.

        RUN_FIRST means run all handlers while they return DECLINED. Thus, needs
        handler to return OK, DONE or error to have it stop processing for that phase.
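The table above can be written down as a simple lookup. This is a sketch only. The phase names are copied from the list as given (without the Python prefix of the directives), and PHASE_POLICY and phases_with are names invented for this illustration.

```python
RUN_ALL = "RUN_ALL"
RUN_FIRST = "RUN_FIRST"

# Policy Apache applies to each non content handler phase, as listed above.
PHASE_POLICY = {
    "PostReadRequestHandler": RUN_ALL,
    "TransHandler": RUN_FIRST,
    "MapToStorageHandler": RUN_FIRST,
    "InitHandler": RUN_ALL,
    "HeaderParserHandler": RUN_ALL,
    "AccessHandler": RUN_ALL,
    "AuthenHandler": RUN_FIRST,
    "AuthzHandler": RUN_FIRST,
    "TypeHandler": RUN_FIRST,
    "FixupHandler": RUN_ALL,
    "LogHandler": RUN_ALL,
}


def phases_with(policy):
    """Return the sorted names of the phases that use the given policy."""
    return sorted(name for name, p in PHASE_POLICY.items() if p == policy)
```

Calling phases_with(RUN_FIRST) gives the phases where a handler returning OK ends processing for that phase.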

        Where multiple handlers are registered within mod_python for a single
        phase it doesn't behave like either of these. In mod_python it will keep
        running the handlers only while OK is returned. Returning DECLINED
        causes it to stop. This existing behaviour can be described (like mod_perl)
        as stacked handlers.

        Having non content handler phases behave differently to how Apache does
        it causes problems. For example things like a string of authentication
        handlers which only say OK when they handle the authentication type,
        can't be implemented properly. In Apache, it should stop the first time
        one returns OK, but in mod_python it will keep running the handlers
        in that phase.

        In summary, it needs to behave more like Apache for the non content
        handler phases.

        In respect of the content handler phase itself, in practice only one handler
        module is supposed to implement it. At the Apache level there is no
        concept of different Apache modules having goes at the content handler
        phase and returning DECLINED if they don't want to handle it. This is
        reflected in how in the type handler phase, selection of the module to
        deliver content is usually done by setting the single valued req.handler
        string. Although, when using mod_python this is done implicitly by
        setting the SetHandler/AddHandler directives and mod_negotiation then
        in turn setting req.handler to be mod_python for you.

        Because mod_python when executed for the content handler phase is
        the only thing generating the content, the existing mechanism of
        stacked handlers and how the status is handled is fine within just
        the content handler phase. Can thus keep that as is and no chance of
        stuffing up existing systems.

        Where another phase calls req.add_handler() to add a handler or multiple
        handlers for the "PythonHandler" (content) phase, these will be added in
        a stacked manner within that phase. This also is the same as it works now.
        There would be no need to have a new function to add stacked handlers
        as that behaviour would be dictated by phase being "PythonHandler".

        For all the non content handler phases though, the current stacked
        handlers algorithm used by mod_python would be replaced with how
        Apache does it. That is, within multiple handlers registered with mod_python
        for non content handler phase, it would use RUN_FIRST or RUN_ALL
        algorithm as appropriate for the phase.

        For those which use RUN_ALL, this wouldn't be much different than what
        mod_python does now except that returning DECLINED would cause it
        to go to next mod_python handler in that phase instead of stopping.
        It is highly unlikely that this change would have an impact as returning
        DECLINED in RUN_ALL phases for how mod_python currently implements
        it, tends not to be useful and can't see that anyone would have been using it.

        For those which use RUN_FIRST, the difference would be significant as
        returning OK will now cause it to stop instead of going to next mod_python
        handler in the phase. Personally I don't think this would be a drama as
        not many people would be using these phases and highly unlikely that
        someone would have listed multiple handlers for such phases. If they had
        and knew what they were doing, they should have long ago realised that
        the current behaviour was a bit broken and it even probably stopped them
        from doing what they wanted unless they fudged it.

        As to use of req.add_handler() for non content handler phases, each call
        would create a distinct handler, ie., no stacking of handlers at all. No
        separate function is required though, as the slight change in behaviour
        is determined from the phase specified.

        To sum up, I think these changes would have minimal if any impact as
        where changes are significant, it isn't likely to overlap with existing code
        as shortcomings in current system would have meant people wouldn't
        have been doing the sorts of things that may have been impacted.

        Therefore, I don't see a need for this to be switch enabled and the
        change could just be made and merely documented.

        Luckily the changes to make it work like above should be fairly easy. All
        it will entail is changing CallBack.HandlerDispatch() to treat status
        differently dependent on phase. No changes to req.add_handler() or
        code processing directives will be required.
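A rough sketch of the kind of change meant here, with status treated differently depending on phase. This is not the actual CallBack.HandlerDispatch() code. The function dispatch, the helper short_name and the choice to return the status of the last handler run are all assumptions made for illustration.

```python
OK, DECLINED, DONE = 0, -1, -2

# Phases which Apache runs as RUN_FIRST, taken from the list earlier in this comment.
RUN_FIRST_PHASES = frozenset([
    "TransHandler", "MapToStorageHandler",
    "AuthenHandler", "AuthzHandler", "TypeHandler",
])
CONTENT_PHASE = "Handler"  # "PythonHandler" with the prefix removed


def short_name(phase):
    """Strip the leading 'Python' from a directive name, if present."""
    prefix = "Python"
    return phase[len(prefix):] if phase.startswith(prefix) else phase


def dispatch(phase, handlers, req):
    """Run the handlers for one phase, choosing the stop rule by phase."""
    name = short_name(phase)
    if name == CONTENT_PHASE:
        # Content phase: existing stacked behaviour, continue only on OK.
        keep_going = lambda status: status == OK
    elif name in RUN_FIRST_PHASES:
        keep_going = lambda status: status == DECLINED
    else:
        # All remaining phases are treated as RUN_ALL.
        keep_going = lambda status: status in (OK, DECLINED)

    status = DECLINED  # assumed result when no handlers are listed
    for handler in handlers:
        status = handler(req)
        if not keep_going(status):
            break
    # Assumption: the status of the last handler run is what goes back to Apache.
    return status
```

Only the stop rule varies with the phase; the loop itself and the list of handlers are the same in every case, which is why nothing else needs to change.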

        grahamd Graham Dumpleton added a comment -

        Linked issues which will be addressed by rewritten module importer and top level handler dispatcher.

        grahamd Graham Dumpleton added a comment -

        This change was in the new importer code, but has now been incorporated into the current importer code as well. Thus dependency on new importer code no longer exists.

        grahamd Graham Dumpleton added a comment -

        After further experimentation while going over changes in mod_python 3.3 to ensure that all looks okay, the idea of keeping the content handler phase the same as what it was previously is actually quite limiting, and one would lose some nice abilities just to ensure compatibility with just a few, if any at all, existing applications, given that there is not really any evidence that anyone uses stacked content handlers. It would be much better to set it now to what is the best way that it can be done.

        The behaviour as it stands for stacked handlers in the content handler phase is that each handler will in turn be executed while ever they return apache.OK. Thus, if a HTTP error is returned, or if apache.DONE or apache.DECLINED is returned, further processing of stacked handlers is terminated.

        What is a pain in this is that returning apache.DECLINED causes subsequent stacked handlers not to be run, with apache.DECLINED becoming the result of the phase. This is contrary to how handlers work in every other phase of Apache processing.

        The consequence of returning apache.DECLINED is that it causes Apache to immediately fall back to executing the default-handler and thus attempt to serve up a static file if one exists to satisfy the request. It does not provide any opportunity to attempt to run other mod_python based content handlers.

        It is much more useful to have returning apache.DECLINED simply cause execution to move on to the next stacked handler, thus preserving its original intent at this level that the handler didn't wish to service the request. This allows a series of content handlers to be attempted, with the appropriate one doing what it needs to and then returning apache.DONE to indicate that it has completed the request, thus stopping subsequent stacked handlers from being run. At the moment one can't do this properly without using hacks which sit on top of mod_python rather than using mod_python itself.

        Note that although returning apache.OK also results in the next stacked handler then being executed much like apache.DECLINED would, the difference comes where the handler is the last one as the return status of the last one becomes the overall return value of the content phase.

        Thus, if one has stacked handlers and all return apache.DECLINED indicating that none want to service the request, one still gets the logical result that control then falls back to the default-handler.

        If one looks at the most probable way that stacked handlers might have been used in existing code, for example:

        PythonHandler header body footer

        Every handler would have simply returned apache.OK. As the last handler is returning apache.OK, one still gets the same result anyway and thus there is practically no risk that existing code would be stuffed up anyway.
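The revised content handler behaviour described above can be sketched as below. This is an illustration of the description only, not the mod_python 3.3 source. The function run_content_new and the sample handlers are hypothetical, and req is just a list used to record output.

```python
OK, DECLINED, DONE = 0, -1, -2


def run_content_new(handlers, req):
    """Stacked content handlers: OK and DECLINED both move on to the next one."""
    status = DECLINED  # assumed result when no handlers are listed
    for handler in handlers:
        status = handler(req)
        if status not in (OK, DECLINED):
            break  # DONE or an error ends the sequence
    # The status of the last handler run becomes the result of the phase.
    return status


def header(req):
    req.append("header")
    return OK


def body(req):
    req.append("body")
    return OK


def footer(req):
    req.append("footer")
    return OK


def not_mine(req):
    return DECLINED  # this handler does not want to service the request


def serve(req):
    req.append("served")
    return DONE  # request completed, later handlers must not run


page = []
page_status = run_content_new([header, body, footer], page)

picked = []
picked_status = run_content_new([not_mine, serve, footer], picked)

nobody_status = run_content_new([not_mine, not_mine], [])
```

The header/body/footer case gives the same result as before, since every handler returns apache.OK. In the second case the handler that declines is skipped and the one returning apache.DONE ends the sequence. When every handler declines, the phase result is DECLINED, so control would fall back to the default-handler.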

        Same deal when there is only a single content handler, still works like it does now and as expected.

        FWIW, a content handler should probably always return apache.DONE rather than apache.OK, except where stacked handlers are used to complete the request in multiple parts like above.

        Finally, as the new way of interpreting return status codes from handlers is only implemented when the new module importer is used, if someone does have code that somehow relies on the old behaviour, they can simply enable the old importer and their code will keep working until they can change how they do things to what is the more logical way. Thus, a migration path of sorts is available.

        Thus, my intention is to make the change as described.


          People

          • Assignee:
            grahamd Graham Dumpleton
            Reporter:
            grahamd Graham Dumpleton