Issue Details

Key: MODPYTHON-171
Type: Bug
Status: Closed
Resolution: Won't Fix
Priority: Major
Assignee: Graham Dumpleton
Reporter: Graham Dumpleton
Votes: 0
Watchers: 0
mod_python

Assignment to req.filename and POSIX style pathnames.

Created: 07/May/06 05:48 PM   Updated: 17/Apr/07 10:49 AM
Component/s: core
Affects Version/s: 3.2.8
Fix Version/s: None

Time Tracking:
Not Specified

Resolution Date: 08/Oct/06 10:15 AM


Description
In Apache, all the path names relating to the matched target of a request are dealt with as POSIX style paths. That is, a forward slash is used as the directory separator even if the platform is Win32. The only real allowance for Win32 stuff is that drive specifiers may still occur in which case the drive letter is always converted to upper case.

All the Apache C API functions dealing with manipulation of and specifically generation of modified paths will by default ensure that paths are maintained in this POSIX style. To have a path be generated in its true native form, you need to provide special flags to functions.

When an Apache module writer works with paths, they would normally rely on the default behaviour and so long as they use the functions provided by the Apache C API, the result will always be consistent.

Where all this becomes a potential issue is where modules set the request_rec->filename attribute, i.e., the req.filename attribute of the mod_python request object. In a C Apache module, as the result is always going to be in the correct form when request_rec->filename is modified, everything still comes out okay.

The problem in mod_python, or perhaps more generally when using Python, is that all the directory manipulation routines in os.path as they exist on the Win32 platform can generate paths with back slashes in them. Further, it is often convenient to use the __file__ attribute of modules in some way, which again is going to use back slashes on the Win32 platform. If the result from either of these is assigned to req.filename, the resulting request_rec->filename attribute is no longer going to be in the POSIX style form in which it would normally exist if only the Apache C APIs were used.
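As a small illustration of the behaviour described above, the following sketch uses the standard ntpath module, which is what os.path resolves to on Win32, so it can be run on any platform. The paths are made-up examples.

```python
import ntpath  # the implementation behind os.path on Win32

# Joining components produces back slashes as separators.
joined = ntpath.join('C:\\htdocs', 'app', 'index.py')

# Normalising a POSIX style path converts it to the native form,
# and the case of the drive letter is left as it was given.
normalised = ntpath.normpath('c:/htdocs/app/../index.py')

print(joined)      # C:\htdocs\app\index.py
print(normalised)  # c:\htdocs\index.py
```

Neither result is in the form Apache itself would place in request_rec->filename.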

One area where this causes a problem (and which isn't fixed) was described in MODPYTHON-161, whereby setting req.filename to a path which includes back slashes instead of the required POSIX style forward slashes can result in the wrong interpreter being selected for a subsequent phase if the PythonInterpPerDirectory directive is being used. The case used for any drive specifier could similarly be a problem.
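To make the failure mode concrete, here is a rough sketch. It assumes, purely for illustration, that interpreters are looked up by the directory string taken from the filename; the `interpreters` table and the `interpreter_for()` helper are hypothetical and are not the mod_python implementation.

```python
import ntpath

# Hypothetical stand-in for a table of interpreters keyed by directory.
interpreters = {}

def interpreter_for(filename):
    # ntpath.dirname() accepts either separator and preserves the spelling,
    # so the key is whatever form the caller happened to use.
    key = ntpath.dirname(filename)
    return interpreters.setdefault(key, object())

posix_style = interpreter_for('C:/htdocs/app/index.py')
back_slashes = interpreter_for('C:\\htdocs\\app\\index.py')
lower_drive = interpreter_for('c:/htdocs/app/index.py')

# Three spellings of the same directory yield three distinct entries.
print(len(interpreters))
print(posix_style is back_slashes)
```

A plain string comparison cannot tell that these name the same directory, which is why the form of the path matters.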

Now although Python provides os.path.normpath(), that normalises a path in the native format. There is no function which can normalise a native path and output it in the POSIX style format. Trying to create a function in Python which does so may not yield the same result as what Apache expects.
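For illustration only, a pure Python attempt might look like the sketch below. The helper name `to_apache_style()` is hypothetical, it makes no attempt to handle UNC paths, and as noted above there is no guarantee that it matches what Apache produces in every case.

```python
import ntpath
import posixpath

def to_apache_style(path):
    """Approximate the POSIX style form of a path (hypothetical helper)."""
    # Split off any drive specifier using the Win32 rules.
    drive, rest = ntpath.splitdrive(path)
    # Switch to forward slashes, then collapse '.', '..' and repeated slashes.
    rest = rest.replace('\\', '/')
    if rest:
        rest = posixpath.normpath(rest)
    # Apache converts the drive letter to upper case.
    return drive.upper() + rest

print(to_apache_style('c:\\htdocs\\app\\..\\index.py'))  # C:/htdocs/index.py
print(to_apache_style('/var/www//site/./index.py'))      # /var/www/site/index.py
```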

The actual function in Apache which can be used to normalise paths and which outputs the POSIX style path required is apr_filepath_merge(). The question is whether this should be exposed in some way so that it is usable from mod_python or whether, for the req.filename case, assignment to req.filename should automatically trigger normalisation of the path. The latter would ensure that it simply works all the time and does not depend on a user of mod_python realising they need to normalise the path in the POSIX style first to keep their code portable across platforms.




Comments
Graham Dumpleton added a comment - 26/Sep/06 01:12 AM
Unfortunately, one cannot auto normalise paths when assigning to req.filename. This is because some Apache modules use req.filename to allow handlers to communicate information to them. An example of this is the proxy-server handler (MODPYTHON-141). By auto normalising the path on assignment to req.filename, this can stuff up a specially formatted string which the third party module is expecting.

What will have to be done instead is for a new function to be supplied in the 'mod_python.apache' module whose sole purpose is to normalise paths. In effect this will be a wrapper for the apr_filepath_merge() function. Because the whole functionality of that APR function isn't required and probably shouldn't be exposed, for now suggest that the function provided follow the Python naming convention more closely. Thus, it would be called normpath().

Thus, if assigning to req.filename, would use:

  from mod_python import apache

  req.filename = apache.normpath('/some/path')

In time it may make sense to add to the 'mod_python.apache' module other functions equivalent to what is in os.path/posixpath. The reason for this is that Apache supplies req.filename and req.uri in a POSIXish style format, but one quite often sees people use os.path functions in their code when by rights they should use posixpath functions to be portable. The posixpath functions, though, don't necessarily treat the drive specifiers which Apache may provide in req.filename on Windows the correct way. To be as compatible as possible, the functions added to 'mod_python.apache' should just wrap the APR functions which provide equivalent functionality.
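A quick sketch of the point about posixpath and drive specifiers, using only the standard library. The filename is a made-up example of what req.filename might hold on Win32.

```python
import ntpath
import posixpath

filename = 'C:/htdocs/app/index.py'  # example value in Apache's POSIXish form

# posixpath keeps forward slashes when manipulating the path ...
print(posixpath.dirname(filename))                       # C:/htdocs/app
print(posixpath.join(posixpath.dirname(filename), 'page.html'))

# ... but it knows nothing about drive specifiers.
print(posixpath.splitdrive(filename))  # ('', 'C:/htdocs/app/index.py')
print(posixpath.isabs(filename))       # False

# The Win32 rules recognise the drive, but other ntpath functions
# such as join() and normpath() would introduce back slashes.
print(ntpath.splitdrive(filename))     # ('C:', '/htdocs/app/index.py')
```

Neither module on its own gives both forward slashes and drive awareness, which is the gap that wrappers around the APR functions would fill.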

Graham Dumpleton added a comment - 08/Oct/06 10:15 AM
Can't auto normalise on assignment to req.filename as detailed so marking this as "won't fix".

Will create a separate issue for the idea of creating equivalents to the os.path/posixpath functions which are compatible with working on the Apache POSIXish style paths with allowance for drive letters on Win32, specifically the value of req.filename.