  1. mod_python
  2. MODPYTHON-193

Add req.hlist.location to mirror req.hlist.directory.

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.1
    • Component/s: core
    • Labels: None

    Description

      In mod_python 3.3, a new function called apache.get_handler_root() is
      available when the new module importer is used. The purpose of the function is
      to return the directory specified by the Directory directive in which the
      current Python*Handler was defined. Where DirectoryMatch, or Directory with a
      ~ match, is used, the value returned always has any wildcards or regular
      expressions expanded and shows the true physical directory matched by Apache
      for the request.

      This function is effectively a wrapper around the value of req.hlist.directory,
      but there is a bit more to it than that. The function is also callable while
      modules are being imported, i.e., outside the context of the actual request
      handler. This is possible because the new importer sets up a per-thread cache
      where it stashes the information for the life of the request.

      Further complications arise where req.add_handler() is used and no handler path
      is supplied as the last argument to this function. In that case
      req.hlist.directory is None, but the handler path associated with the context
      in which req.add_handler() was called can be determined by tracking back
      through req.hlist.parent until a context with the directory attribute set is
      found. To save the user from doing this, the value that
      apache.get_handler_root() returns has already had that done where necessary.
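
The tracking back through req.hlist.parent described above can be sketched roughly as follows. This is only an illustration that runs outside Apache: HandlerNode is a hypothetical stand-in for an hlist entry, carrying just the directory and parent attributes mentioned in this issue, and find_handler_root() is an assumed helper name, not the actual mod_python implementation.

```python
class HandlerNode:
    """Hypothetical stand-in for an hlist entry (directory and parent only)."""

    def __init__(self, directory=None, parent=None):
        self.directory = directory
        self.parent = parent


def find_handler_root(hlist):
    """Walk from hlist up through its parents and return the first
    directory that is set, or None if no context specifies one."""
    node = hlist
    while node is not None:
        if node.directory is not None:
            return node.directory
        node = node.parent
    return None


# A handler defined in a Directory context, and a second handler added
# from it via req.add_handler() with no handler path (directory is None).
outer = HandlerNode(directory="/var/www/app/")
inner = HandlerNode(directory=None, parent=outer)
root = find_handler_root(inner)
```

In this sketch the inner context inherits the directory of the context it was added from, which is the behaviour apache.get_handler_root() is described as providing.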

      The reason for making the handler root available when modules are being
      imported is that it makes it a lot easier for web applications to use the
      directory that the Python*Handler directive was defined for as an anchor point
      for the application code. Further module imports or config files can then be
      located dynamically relative to that directory, rather than having to hard
      code paths in the Apache configuration using PythonOption. In using this
      though, one does have to be careful that modules aren't shared between two
      handler roots, using PythonInterpreter to separate two distinct web
      applications when necessary.
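
As a rough sketch of the anchor point idea, a module could build its file paths from the handler root instead of from hard-coded configuration. Under mod_python the root would come from apache.get_handler_root() at import time; here it is passed in as an argument so that the example runs outside Apache. The helper name path_under_root and the conf/site.ini layout are assumptions made for illustration only.

```python
import os


def path_under_root(handler_root, *parts):
    """Build a normalised path for a file located below the handler root."""
    return os.path.normpath(os.path.join(handler_root, *parts))


# Under mod_python this value would be apache.get_handler_root();
# a fixed string is used here purely for illustration.
handler_root = "/var/www/app/"
config_file = path_under_root(handler_root, "conf", "site.ini")
```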

      This is all well and good if the Directory/DirectoryMatch directives are used,
      but useless if the Location/LocationMatch directives are used. Where these are
      currently used, apache.get_handler_root() and req.hlist.directory yield '/'. I
      think originally I had the code returning an empty string, but when support for
      expansion of wildcards was added and path normalisation done, the '/' was
      getting returned instead.

      For starters, instead of '/' the None value should be the result where
      Location/LocationMatch directives are used. Second, there should really be an
      equivalent attribute, req.hlist.location, which yields the leading part of the
      URL which corresponds to the directory stored in req.hlist.directory. In effect
      this yields an absolute base URL and would mean that it would no longer be
      necessary to perform calculations like those described in:

      http://www.modpython.org/pipermail/mod_python/2006-March/020501.html

      for calculating handler base URLs where Directory/DirectoryMatch is used,
      something that most people seem to get wrong from what I have seen.

      An important thing about that code is that it only works when
      Directory/DirectoryMatch is used. There is no way (at least that I know of)
      of determining what the expanded path corresponding to a
      Location/LocationMatch directive is. This is a major grumbling point for
      packages like Trac, MoinMoin, Django and TurboGears, as it means that they have
      to require the user to manually duplicate the path given to the directive in a
      PythonOption or using some other configuration mechanism so that the package
      knows where its root URL is.

      Thus, if req.hlist.location can be supplied, this would solve this problem. In
      respect of apache.get_handler_root(), I am not sure there really should be an
      equivalent within the apache module, as knowing the location at the time of
      import sounds a bit dubious to me, even if it might be useful where a package
      performs configuration at time of import. It would be much more sensible for a
      package to use the req.hlist.location value at the time of each request. One
      option is to add a req.base_uri attribute or req.get_base_uri() method to the
      request object. This would take into consideration the need to recurse back
      through parent handler contexts where req.add_handler() is used, like with
      req.hlist.directory.

      In summary:

      1. Change code so req.hlist.directory is None where Location/LocationMatch
      directive is used.

      2. Add req.hlist.location which gives the base URL, i.e., the leading path of
      the URL, which equates to the directory specified by req.hlist.directory where
      the directory has come from the Apache configuration.

      3. Look at adding a new method or attribute to request object which provides
      the base URL with value being inherited from parent handler contexts where
      appropriate. Would need to select an appropriate name for this.

      I think this is important enough to sneak it into mod_python 3.3, then we can
      silence those other packages who grumble that it can't be determined.

        Activity

          When this is implemented, code in Session class can be changed from:

            # the path where *Handler directive was specified
            dirpath = self._req.hlist.directory
            if dirpath:
                docroot = self._req.document_root()
                c.path = dirpath[len(docroot):]
            else:
                c.path = '/'

            # Sometimes there is no path, e.g. when Location
            # is used. When Alias or UserDir are used, then
            # the path wouldn't match the URI. In those cases
            # just default to '/'
            if not c.path or not self._req.uri.startswith(c.path):
                c.path = '/'

          to something that uses req.hlist.location instead. It will need to traverse through parent contexts if necessary to find the point where req.hlist.location is not None.

          There is a small chance that making this change will cause problems with existing setups which are relying on the default being '/' when Location directive is set.

          The comment in the above code suggests a need to make sure the original change proposed works correctly when UserDir or Alias comes into play.
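
A minimal sketch of what such a replacement might look like, assuming req.hlist.location holds a plain URL prefix when set and None otherwise. HandlerNode is a hypothetical stand-in for an hlist entry, and find_location() and cookie_path() are assumed helper names; this is not the actual Session code.

```python
class HandlerNode:
    """Hypothetical stand-in for an hlist entry (location and parent only)."""

    def __init__(self, location=None, parent=None):
        self.location = location
        self.parent = parent


def find_location(hlist):
    """Traverse parent contexts until one with a location is found."""
    node = hlist
    while node is not None:
        if node.location is not None:
            return node.location
        node = node.parent
    return None


def cookie_path(hlist, uri):
    """Choose the session cookie path, defaulting to '/' when no location
    is available or when it does not match the request URI."""
    location = find_location(hlist)
    if not location or not uri.startswith(location):
        return "/"
    return location


# Handler added via req.add_handler() from within a Location context.
outer = HandlerNode(location="/app")
inner = HandlerNode(parent=outer)
path = cookie_path(inner, "/app/page")
```

The fallback to '/' keeps the existing default for setups where no location can be found, which is the compatibility concern raised above.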

          grahamd Graham Phillip Dumpleton added a comment.

          Implements items 1 and 2, although for 2 only supply req.hlist.location for when Location/LocationMatch directive is actually used. Did not try and synthesise a location when Directory directive is used as process depends on values of req.uri/req.filename/req.path_info, which can be modified by handlers with such modifications giving incorrect results. Did not do item 3.

          Have also not made changes to Session class as it separately has a number of problems in that section of code and they cannot all be addressed properly. Will open a separate issue for Session path problems.

          grahamd Graham Phillip Dumpleton added a comment.

          People

            Assignee: grahamd Graham Phillip Dumpleton
            Reporter: grahamd Graham Phillip Dumpleton