Key: MODPYTHON-22
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Minor
Assignee: Nicolas Lehuen
Reporter: Graham Dumpleton
Votes: 0
Watchers: 0

mod_python

mod_python.publisher extension handling

Created: 26/Feb/05 01:38 PM   Updated: 05/Mar/06 01:41 PM
Component/s: None
Affects Version/s: 3.1.3, 3.1.4
Fix Version/s: 3.2.7

Time Tracking:
Not Specified

Resolution Date: 30/Apr/05 03:50 PM


Description
The following code in mod_python.publisher doesn't appear to be correct.

  imp_suffixes = " ".join([x[0][1:] for x in imp.get_suffixes()])

    # get rid of the suffix
    # explanation: Suffixes that will get stripped off
    # are those that were specified as an argument to the
    # AddHandler directive. Everything else will be considered
    # a package.module rather than module.suffix
    exts = req.get_addhandler_exts()
    if not exts:
        # this is SetHandler, make an exception for Python suffixes
        exts = imp_suffixes
    if req.extension: # this exists if we're running in a | .ext handler
        exts += req.extension[1:]
    if exts:
        suffixes = exts.strip().split()
        exp = "\\." + "$|\\.".join(suffixes)
        suff_matcher = re.compile(exp) # python caches these, so its fast
        module_name = suff_matcher.sub("", module_name)

For starters, imp.get_suffixes() returns:

    [('.so', 'rb', 3), ('module.so', 'rb', 3), ('.py', 'U', 1), ('.pyc', 'rb', 2)]

on a UNIX platform. This yields the following value for imp_suffixes:

  'so odule.so py pyc'

I.e., the "m" has been lost from "module.so". As a dynamically loaded C module
is unlikely to be usable within publisher anyway, this is no loss at this
point.
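
The slicing can be reproduced without calling the imp module (which was removed in Python 3.12) by hard-coding the suffix list quoted above. This is only a sketch; the name `unix_suffixes` is introduced here for illustration and is not part of mod_python.

```python
# Suffix list as quoted above for a UNIX platform (hard-coded for illustration).
unix_suffixes = [('.so', 'rb', 3), ('module.so', 'rb', 3),
                 ('.py', 'U', 1), ('.pyc', 'rb', 2)]

# Same expression as in mod_python.publisher: the first character of every
# suffix is dropped, on the assumption that it is always a leading '.'.
imp_suffixes = " ".join([x[0][1:] for x in unix_suffixes])

print(repr(imp_suffixes))  # 'so odule.so py pyc'
```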

Now if one were using:

  SetHandler python-program
  PythonHandler mod_python.publisher | .xxx
  PythonHandler mod_python.publisher | .py

then req.get_addhandler_exts() returns:

  ''

I.e., an empty string. Thus "exts" gets set to imp_suffixes:

  exts = 'so odule.so py pyc'

Now because ".py" and ".xxx" were specified on the PythonHandler lines, req.extension
is set to ".py" or ".xxx" as appropriate for the request. When this is appended to "exts",
no space is added, and thus the result for a ".py" request is:

  exts = 'so odule.so py pycpy'

For a ".py" extension this causes no harm, as "py" is already listed in exts at that
point and thus things still work okay.

The lack of a space does break things for ".xxx" though. If one used a URL
with a .xxx extension, one would expect the extension to be dropped and the
request to still work, but because exts gets rewritten as:

  exts = 'so odule.so py pycxxx'

it doesn't work; instead you get a not found error.
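
The failure can be reproduced outside Apache with a small stand-in for the quoted logic. This is a sketch under assumptions: `strip_suffix_buggy` and `IMP_SUFFIXES` are hypothetical names, and the request object is replaced by plain arguments carrying the values described above.

```python
import re

# Value of imp_suffixes quoted above for a UNIX platform.
IMP_SUFFIXES = "so odule.so py pyc"

def strip_suffix_buggy(module_name, addhandler_exts, extension):
    """Mimic the quoted publisher logic, including the missing space."""
    exts = addhandler_exts
    if not exts:
        # SetHandler case: fall back to the Python suffixes.
        exts = IMP_SUFFIXES
    if extension:
        exts += extension[1:]  # no separator is added before the extension
    if exts:
        suffixes = exts.strip().split()
        exp = "\\." + "$|\\.".join(suffixes)
        module_name = re.sub(exp, "", module_name)
    return module_name

print(strip_suffix_buggy("foo.py", "", ".py"))    # foo
print(strip_suffix_buggy("foo.xxx", "", ".xxx"))  # foo.xxx
```

In the second call the suffix list ends in "pycxxx", so ".xxx" is never matched and the module name keeps its extension.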

The only way around it is to use:

  SetHandler python-program
  AddHandler python-program .xxx
  PythonHandler mod_python.publisher | .xxx
  PythonHandler mod_python.publisher | .py

I.e., as well as SetHandler, define AddHandler for at least one extension type.
In doing this, req.get_addhandler_exts() now returns:

  'xxx '

Notice how there is a space at the end of the string. Because the string is
non-empty, exts isn't set to imp_suffixes. When req.extension is appended, the
existing space means all is okay and the result will be:

  'xxx xxx'

In summary, instead of:

  exts += req.extension[1:]

the code should be the equivalent of:

  exts += ' '
  exts += req.extension[1:]

Do note that when req.get_addhandler_exts() returns something, this will mean
spaces are doubled up. This appears to be okay, as the split() function treats
adjoining whitespace as a single field separator when splitting a string.
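
The effect of the added space can be checked in isolation. This is a minimal sketch; `build_suffix_list` is a hypothetical helper that takes the values of exts and req.extension as plain strings.

```python
import re

def build_suffix_list(exts, extension):
    """Append the request extension with an explicit separator, then split."""
    exts += ' '
    exts += extension[1:]
    # split() with no argument collapses runs of whitespace,
    # so a doubled space does not create an empty field.
    return exts.strip().split()

# AddHandler case: the trailing space is doubled up but harmless.
with_addhandler = build_suffix_list('xxx ', '.xxx')
print(with_addhandler)  # ['xxx', 'xxx']

# SetHandler case: "xxx" is now a separate entry instead of "pycxxx".
with_sethandler = build_suffix_list('so odule.so py pyc', '.xxx')
print(with_sethandler)  # ['so', 'odule.so', 'py', 'pyc', 'xxx']

# Same pattern construction as in the quoted code.
exp = "\\." + "$|\\.".join(with_sethandler)
stripped = re.sub(exp, "", "foo.xxx")
print(stripped)  # foo
```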

Another bit of code which is a bit loose is:

  exp = "\\." + "$|\\.".join(suffixes)

When this is applied to:

   'xxx xxx'

one gets:

  '\\.xxx$|\\.xxx'

The intent of the regular expression is that it will remove the extension
just from the end of the URL, i.e., "foo.xxx" yields "foo". However, because
there is no '$' on the very last part of the pattern, any instance of the last
extension will be removed from anywhere in the string, i.e.:

  suff_matcher.sub("","aaa.xxxbbb.foo.xxx")

yields:

  'aaabbb.foo'

instead of:

  "aaa.xxxbbb.foo"

At the moment this may not be an issue, as the way Apache does its matching
and Python does its module lookup means that neither would produce a valid
result in the first place, even if one were to have files with '.' in the actual
name besides that used for the extension. :-)
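
The difference between the two results can be seen by comparing the pattern as built above with one that anchors every alternative. The anchored form is only one possible way to tighten the expression, shown here as an assumption rather than as the change that was actually made; `loose` and `anchored` are illustrative names.

```python
import re

suffixes = ['xxx', 'xxx']

# Pattern as built by the quoted code: the last alternative has no '$'.
loose = re.compile("\\." + "$|\\.".join(suffixes))

# Hypothetical alternative: escape each suffix and anchor the whole group,
# so only an extension at the very end of the string is removed.
anchored = re.compile(r"\.(?:" + "|".join(re.escape(s) for s in suffixes) + r")$")

name = "aaa.xxxbbb.foo.xxx"
loose_result = loose.sub("", name)
anchored_result = anchored.sub("", name)

print(loose_result)     # aaabbb.foo
print(anchored_result)  # aaa.xxxbbb.foo
```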

The missing space still needs to be fixed though.
