mod_python / MODPYTHON-146

ap_internal_fast_redirect() and request object cache

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.2.8
    • Fix Version/s: None
    • Component/s: core
    • Labels: None

      Description

      mod_python uses a Python class to wrap the Apache request_rec structure. The primary purpose of this request object wrapper is to provide access to the request_rec internals. The wrapper also lets handlers add their own attributes to it, which allows handlers to pass information to one another. This works because, before creating a fresh request object wrapper, a handler looks up whether a request object has already been created for the request as a whole and, if so, uses the existing one.
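
The lookup described above can be modelled in plain Python outside Apache. This is only a minimal sketch, not mod_python's actual C implementation: FakeRequestRec, RequestWrapper, the dictionary-based request_config and the 'python_request' key are hypothetical stand-ins, and only the name get_request_object() is borrowed from this issue.

```python
class FakeRequestRec:
    """Hypothetical stand-in for Apache's request_rec structure."""
    def __init__(self, uri):
        self.uri = uri
        self.request_config = {}  # per-request module storage (assumed shape)


class RequestWrapper:
    """Hypothetical stand-in for the Python request object wrapper."""
    def __init__(self, rec):
        self._rec = rec

    @property
    def uri(self):
        return self._rec.uri


def get_request_object(rec):
    """Return the wrapper cached for rec, creating it on first use."""
    wrapper = rec.request_config.get('python_request')
    if wrapper is None:
        wrapper = RequestWrapper(rec)
        rec.request_config['python_request'] = wrapper
    return wrapper


rec = FakeRequestRec('/docs/')
# An early phase handler attaches an attribute to the wrapper ...
get_request_object(rec).session_user = 'alice'
# ... and a later phase handler sees it, because the same wrapper is reused.
later = get_request_object(rec)
```

Because the wrapper is cached against the request, any attribute set in one phase is visible in every later phase of that same request.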

      This generally works, but the DirectoryIndex directive and ap_internal_fast_redirect() cause undesirable behaviour in specific cases.

      When a request is made against a directory, mod_dir detects this in a handler hooked LAST in the fixup phase and uses ap_internal_fast_redirect() to determine whether any of the files listed in the DirectoryIndex directive exist. For the file that would be matched by a DirectoryIndex entry, this function runs through all request phases up to and including the fixup phase. If the status comes back OK, indicating that the request could be satisfied, it copies the information from the sub request's request_rec structure into the parent request_rec structure. Processing then continues with this information into the content handler phase.

      The problem is that ap_internal_fast_redirect() knows only about the request_rec structure and nothing about the Python request object wrapper. As a consequence, the request object created for the sub request, which ran successfully through to the fixup phase, is ignored, and the one originally created for the parent request continues to be used. Any attributes that handlers added to the sub request's request object in phases up to and including the fixup phase are therefore lost.
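
The loss of attributes can be reproduced with a small simulation. This is a sketch under stated assumptions: fake_fast_redirect() is a hypothetical, heavily simplified model that copies two fields and leaves request_config alone, and the classes are illustrative stand-ins rather than real Apache or mod_python types.

```python
class FakeRequestRec:
    """Hypothetical stand-in for Apache's request_rec structure."""
    def __init__(self, uri, filename):
        self.uri = uri
        self.filename = filename
        self.request_config = {}  # where the Python wrapper is cached (assumed)


class RequestWrapper:
    """Hypothetical stand-in for the Python request object wrapper."""
    def __init__(self, rec):
        self._rec = rec


def get_request_object(rec):
    """Return the wrapper cached for rec, creating it on first use."""
    wrapper = rec.request_config.get('python_request')
    if wrapper is None:
        wrapper = RequestWrapper(rec)
        rec.request_config['python_request'] = wrapper
    return wrapper


def fake_fast_redirect(sub, parent):
    """Simplified model: copy request_rec fields from sub request to parent."""
    parent.uri = sub.uri
    parent.filename = sub.filename
    # request_config is not copied, so the sub request's wrapper is left behind.


parent = FakeRequestRec('/docs/', '/var/www/docs/')
get_request_object(parent)  # wrapper created during the parent's early phases

sub = FakeRequestRec('/docs/index.html', '/var/www/docs/index.html')
get_request_object(sub).auth_info = 'set by sub request fixup handler'

fake_fast_redirect(sub, parent)

# The content handler phase runs against the parent and gets the old wrapper.
content_phase_req = get_request_object(parent)
lost = not hasattr(content_phase_req, 'auth_info')
```

In this model the parent now describes index.html, yet the wrapper handed to the content handler is the one created before the sub request ran, so auth_info is missing.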

      One possible fix is for the get_request_object() function in mod_python to add a tag to req->notes identifying the instance of the request object that has been created. Because req->notes will be overlaid by the contents of the sub request's notes table, get_request_object() would be able to detect when sub request data has been copied into the parent. It could then switch to the request object created for the sub request, updating that object's request_rec member to point to the parent request_rec instead.
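
The tagging idea might look roughly like the following. This is a sketch only: the registry, the 'mod_python.request_id' key and the classes are hypothetical, and it assumes the sub request's notes cleanly replace the parent's, which is a simplification of what Apache does.

```python
import itertools

_ids = itertools.count(1)
_registry = {}                    # tag -> wrapper (hypothetical registry)
TAG = 'mod_python.request_id'     # hypothetical notes key


class FakeRequestRec:
    """Hypothetical stand-in for Apache's request_rec structure."""
    def __init__(self, uri):
        self.uri = uri
        self.notes = {}
        self.request_config = {}


class RequestWrapper:
    """Hypothetical stand-in for the Python request object wrapper."""
    def __init__(self, rec):
        self._rec = rec
        self.object_id = str(next(_ids))


def get_request_object(rec):
    """Return the wrapper for rec, switching wrappers if the notes tag changed."""
    wrapper = rec.request_config.get('python_request')
    if wrapper is None:
        wrapper = RequestWrapper(rec)
        _registry[wrapper.object_id] = wrapper
        rec.request_config['python_request'] = wrapper
        rec.notes[TAG] = wrapper.object_id
        return wrapper
    tag = rec.notes.get(TAG)
    if tag != wrapper.object_id and tag in _registry:
        # The notes now carry a sub request's tag: adopt that wrapper
        # and point it at this (parent) request_rec.
        wrapper = _registry[tag]
        wrapper._rec = rec
        rec.request_config['python_request'] = wrapper
    return wrapper


parent = FakeRequestRec('/docs/')
get_request_object(parent)

sub = FakeRequestRec('/docs/index.html')
get_request_object(sub).flag = 'from sub request'

# Simplified model of the fast redirect: sub request data lands in the parent.
parent.uri = sub.uri
parent.notes = dict(sub.notes)

switched = get_request_object(parent)
```

After the simulated copy, the lookup against the parent returns the sub request's wrapper, with its attributes intact and its request_rec reference updated.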

      Storing an id for the request object in the req->notes table may have a further benefit. When an internal redirect occurs, instead of creating a fresh request object wrapper instance for the req.main attribute, mod_python could use the id in req->notes to get hold of the real request object of the parent and chain to it properly. A sub request would then be able to access attributes added to the request object of the parent request, which cannot currently be done.
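
A rough model of chaining req.main through such an id is shown here. Everything in it is an illustrative assumption: register(), the registry, the notes key and the classes are hypothetical and do not correspond to existing mod_python functions.

```python
_registry = {}                    # tag -> wrapper (hypothetical registry)
TAG = 'mod_python.request_id'     # hypothetical notes key


class FakeRequestRec:
    """Hypothetical stand-in for Apache's request_rec structure."""
    def __init__(self, uri, main=None):
        self.uri = uri
        self.main = main          # parent request_rec, or None for a main request
        self.notes = {}


class RequestWrapper:
    """Hypothetical stand-in for the Python request object wrapper."""
    def __init__(self, rec):
        self._rec = rec

    @property
    def main(self):
        """Return the parent's existing wrapper if its tag is registered."""
        parent_rec = self._rec.main
        if parent_rec is None:
            return None
        existing = _registry.get(parent_rec.notes.get(TAG))
        if existing is not None:
            return existing
        # Fallback: a fresh wrapper, as the issue says happens today.
        return RequestWrapper(parent_rec)


def register(rec):
    """Create a wrapper for rec and record its tag in rec.notes."""
    wrapper = RequestWrapper(rec)
    tag = str(id(wrapper))
    _registry[tag] = wrapper
    rec.notes[TAG] = tag
    return wrapper


parent_rec = FakeRequestRec('/docs/')
parent = register(parent_rec)
parent.user_prefs = {'lang': 'en'}

sub_rec = FakeRequestRec('/docs/index.html', main=parent_rec)
sub = register(sub_rec)
```

With the tag in place, sub.main resolves to the very object the parent's handlers used, so user_prefs is reachable from the sub request.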

      Now, if you understand everything I have said above, you have done well.

      Depending on whether people understand or not, when I get a chance I'll try to attach some examples of handlers which demonstrate the problem.

      Acknowledgements that you understand the issue would be appreciated.

        Activity

        Graham Dumpleton created issue -
        Graham Dumpleton added a comment -

        In practice, the idea of using req->notes to store an ID for the request object cannot be used, or at least not easily. This is because ap_internal_fast_redirect() wrongly merges the contents of the notes tables from the main request and the sub request. This issue has been raised on the main Apache developers mailing list:

        http://www.mail-archive.com/dev%40httpd.apache.org/msg31263.html

        The responses indicate that it is a bug, but also that it possibly will not be truly fixed until Apache 2.4, where a proper internal redirect would be used rather than a fast redirect.

        http://www.mail-archive.com/dev%40httpd.apache.org/msg31264.html
        http://www.mail-archive.com/dev%40httpd.apache.org/msg31265.html
        http://www.mail-archive.com/dev%40httpd.apache.org/msg31266.html

        Even if mod_python tried to implement a solution for use of the wrong request object, the way that req.notes is wrongly merged from the sub request into the main request means that the data in that table will be corrupted and possibly not usable. As a consequence, there doesn't seem to be much point in trying to fix this problem.

        The only workaround at this point is for the user to implement a fixup handler which hijacks the DirectoryIndex mapping mechanism, determines which file to redirect to, and executes an internal redirect itself. For example:

        import os
        from mod_python import apache

        def fixuphandler(req):
            # Directory request: redirect internally to the index file ourselves.
            if os.path.isdir(req.filename):
                uri = req.uri + 'index.html'
                if req.args:
                    uri += '?' + req.args
                req.internal_redirect(uri)
                return apache.DONE
            return apache.OK

        This should really just use request_rec->finfo->filetype, but mod_python doesn't seem to expose the file type attribute through the request object. It is therefore necessary to perform a redundant check of whether the target is a directory.

        Alternatively, one could use:

        from mod_python import apache

        def fixuphandler(req):
            # Rely on the magic content type instead of checking the filesystem.
            if req.content_type == 'httpd/unix-directory':
                uri = req.uri + 'index.html'
                if req.args:
                    uri += '?' + req.args
                req.internal_redirect(uri)
                return apache.DONE
            return apache.OK

        since mod_mime seems to set the content type to this magic value when request_rec->finfo->filetype is APR_DIR, i.e., a directory. Should mod_python provide this magic content type as apache.DIR_MAGIC_TYPE?

        Note that these examples assume that the index file exists, but a handler could just as easily iterate over a list of possible targets until one is found. It would get even more complicated if one had to consider different choices based on language.
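
Iterating over possible targets could be sketched as follows. The INDEX_CANDIDATES list and the find_index() helper are hypothetical names introduced for illustration; the request attributes and apache constants are the ones used in the examples above, and the trailing-slash check reflects the correction given later in this issue.

```python
import os

# Hypothetical candidate list; a real setup would mirror its DirectoryIndex directive.
INDEX_CANDIDATES = ('index.html', 'index.htm')


def find_index(directory, candidates=INDEX_CANDIDATES):
    """Return the first candidate that exists as a file in directory, else None."""
    for name in candidates:
        if os.path.isfile(os.path.join(directory, name)):
            return name
    return None


def fixuphandler(req):
    # Imported here so that find_index() can be exercised outside Apache.
    from mod_python import apache

    # Only take over for directory requests whose URI already ends in a slash.
    if os.path.isdir(req.filename) and req.uri.endswith('/'):
        name = find_index(req.filename)
        if name is not None:
            uri = req.uri + name
            if req.args:
                uri += '?' + req.args
            req.internal_redirect(uri)
            return apache.DONE
    # No index file found, or not a directory: let Apache carry on as usual.
    return apache.OK
```

Keeping the file search in a separate function means it can be checked without a running server, while the handler itself stays close to the original recipe.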

        Graham Dumpleton added a comment -

        In addition to ignoring the sub request's Python request object, ap_internal_fast_redirect() also doesn't transfer anything held in the sub request's req->request_config attribute. As well as holding the Python request object, this holds information about dynamically registered handlers and input/output filters. The details of any handlers and input/output filters registered by the sub request are therefore lost.

        Graham Dumpleton added a comment -

        Correction to the previous recipe for explicitly using an internal redirect to get around the problems described above. The code should only perform the redirect when req.uri already ends with a slash; otherwise it will override Apache's inbuilt mechanism for adding the trailing slash by sending a redirect back to the client.

        Using features now available because of other changes, the recipe should be:

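        from mod_python import apache

        def fixuphandler(req):
            # Only act on directories, using the file type Apache already determined.
            if req.finfo[apache.FINFO_FILETYPE] == apache.APR_DIR:
                # Without a trailing slash, leave Apache to redirect the client.
                if req.uri.endswith('/'):
                    uri = req.uri + 'index.html'
                    if req.args:
                        uri += '?' + req.args
                    req.internal_redirect(uri)
                    return apache.DONE
            return apache.OK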

        Graham Dumpleton added a comment -

        A more thorough analysis of the DirectoryIndex problems was posted on the mod_python developers mailing list:

        http://www.mail-archive.com/python-dev@httpd.apache.org/msg01736.html


          People

          • Assignee: Graham Dumpleton
          • Reporter: Graham Dumpleton