Issue Details (XML | Word | Printable)

Key: MODPYTHON-128
Type: Improvement Improvement
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Graham Dumpleton
Reporter: Graham Dumpleton
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
mod_python

Have assigning req.filename automatically update req.finfo.

Created: 10/Feb/06 07:40 AM   Updated: 05/Apr/07 11:49 AM
Component/s: core
Affects Version/s: 3.3.x
Fix Version/s: 3.3.1

Time Tracking:
Not Specified

File Attachments:
  Size
File Licensed for inclusion in ASF works MP128_20060423_grahamd_1.diff 2006-04-23 05:39 PM Graham Dumpleton 1 kB
Issue Links:
Blocker
 

Resolution Date: 25/Aug/06 01:07 AM


 Description  « Hide
Although it is possible to assign a new value to "req.filename", it is not possible to update "req.finfo" based on the new filename.

Suggest that if "req.filename" is assigned a new value, that apr_stat() be automatically called to update "req.finfo". Ie., internally mod_python would do something like:

  apr_stat(&r->finfo, r->filename, APR_FINFO_MIN, r->pool);

I believe that mod_perl supports a similar feature, but would need to confirm this.

Related to "req.filename", the "req.canonical_filename" should also be writable as when changing "req.filename" the latter should also by rights be updated as well.



 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Graham Dumpleton added a comment - 23/Feb/06 07:53 PM
Here is link to discussion of similar change being ported from mod_perl 1 to mod_perl 2.

  http://www.gossamer-threads.com/lists/modperl/dev/8281

Their code was:

+static MP_INLINE
+char *mpxs_Apache__RequestRec_filename(pTHX_ request_rec *r,
+ SV *name)
+{
+ char *retval = r->filename;
+
+ if (name) {
+ STRLEN len;
+ const char *val = SvPV(name, len);
+
+ MP_TRACE_o(MP_FUNC, "setting r->filename to %s\n",
+ val);
+
+ /* set r->filename to the incoming value */
+ r->filename = apr_pstrndup(r->pool, val, len);
+
+ /* and update r->finfo so later calcuations work properly */
+ apr_status_t rv = apr_stat(&r->finfo, r->filename,
+ APR_FINFO_MIN, r->pool);
+
+ if (rv != APR_SUCCESS) {
+ MP_TRACE_o(MP_FUNC, "unable to update finfo for %s\n",
+ name);
+ r->finfo.filetype = 0;
+ }
+ }
+
+ return retval;
+}

Worth noting is that they set finfo.filetype to 0 if stat fails.

Consulting:

  http://docx.webperf.org/structapr__finfo__t.html
  http://docx.webperf.org/group__apr__file__info.html#gga3a66
  http://docx.webperf.org/apr__file__info_8h-source.html

rather than being assigned to zero, the constant APR_NOFILE should probably be used.

Is finfo.filetype being 0/APR_NOFILE truely indicative of data not being valid by itself?

Need to dig into apr_stat() further when have time.

Graham Dumpleton added a comment - 19/Mar/06 07:50 AM
As part of this change, a new req.filetype attribute should possibly be added. This is necessary as req.finfo returns a tuple of attributes about the file, but not the request_rec->finfo->filetype attribute.

Other alternative is for finfo to return a class like object that acts like a tuple, but also allows access by name to attributes of structure. In other words like the traditional result of the Python os.stat() function call.

To be able to meaningfully make use of the filtetype attribute, however exposed, there needs to be Python constants for:

 typedef enum {
     APR_NOFILE = 0,
     APR_REG,
     APR_DIR,
     APR_CHR,
     APR_BLK,
     APR_PIPE,
     APR_LNK,
     APR_SOCK,
     APR_UNKFILE = 127
 } apr_filetype_e;

Because this is an enum where literal values are not assigned to all values, actual literal constants cannot simply be provided in Python module as no gaurantees by ANSI C standard (I think) that successive values will be used. Thus, would need through the mod_python._apache module to expose the actual integer values for these enum values and these would then subsequently be set in the mod_python.apache module.

If existing naming practice in conjunction with variables related to finfo structure are used, the APR_ prefix would be dropped and the constants would be:

     mod_python.apache.NOFILE
     mod_python.apache.REG,
     mod_python.apache.DIR,
     mod_python.apache.CHR,
     mod_python.apache.BLK,
     mod_python.apache.PIPE,
     mod_python.apache.LNK,
     mod_python.apache.SOCK,
     mod_python.apache.UNKFILE = 127

Is the use of the unprefixed names acceptable? Do we run the risk over time of having naming collisions if we keep dropping prefixes? Should we instead from now on always keep the prefixes? If we do that, what do we do about all the mod_python.apache.FINFO_* constants? Do we provided prefixed versions, document them and deprecate use of the older variants?

Overall, there are a mix of constants in the mod_python.apache module which do an don't have prefixes. Some really core ones it would not make sense to change to now have prefixes, but some others outside of that core could perhaps be. For example all the URI_* constants. Even if we do not change anything, we need to define a rule/guideline for going forwards.



Graham Dumpleton added a comment - 19/Mar/06 07:59 AM
Put dunce cap on, stand in corner.

Naming is consistent now. Prefixes are never dropped. the FINFO_* and URI_* constants are special as there is no equivalent in Apache and they are only relevant to mod_python.

Thus, contains should therefore be:

     mod_python.apache.APR_NOFILE
     mod_python.apache.APR_REG
     mod_python.apache.APR_DIR
     mod_python.apache.APR_CHR
     mod_python.apache.APR_BLK
     mod_python.apache.APR_PIPE
     mod_python.apache.APR_LNK
     mod_python.apache.APR_SOCK
     mod_python.apache.APR_UNKFILE

Graham Dumpleton added a comment - 19/Mar/06 09:39 AM
Why can't I see the obvious today.

Probably the best/easiest solution to the filetype member being missing is to add it to the end of the existing tuple and provide a new apache.FINFO_FILETYPE attribute for accessing it. Still need new constants for working out what its value means.

Graham Dumpleton added a comment - 19/Mar/06 12:28 PM
Back to the issue of whether when req.filename is modified that req.finfo be updated automatically, the outcome in mod_perl 2.0 was that they did not preserve the mod_perl 1.0 behaviour. Instead they note the issue in documentation as:

  filename

  Get/set the filename on disk corresponding to this response (the result of the URI --> filename translation).

    $filename = $r->filename();
    $prev_filename = $r->filename($new_filename);

  since: 2.0.00
  Note that if you change the filename after the PerlMapToStorageHandler phase was run and expect Apache
  to serve it, you need to update its stat record, like so:

    use Apache2::RequestRec ();
    use APR::Finfo ();
    use APR::Const -compile => qw(FINFO_NORM);
    $r->filename($newfile);
    $r->finfo(APR::Finfo::stat($newfile, APR::Const::FINFO_NORM, $r->pool));

  if you don't, Apache will still try to use the previously cached information about the previously set value of the filename.

Now mod_perl is a bit different to mod_python in that it provides a much lower level mapping to Apache internals than what mod_python does. As such it provides a means of updating the finfo structure directly whereas mod_python doesn't. The goal in mod_perl is actually to make the API as close as possible to the Apache C API and why they decided not to update finfo automatically.

The question still remains what should mod_python do? Because it doesn't provide low level wrappers for stating files using APR routines nor of updating finfo directly, should it therefore update finfo directly, or should it provide some alternate means of updating it when desired?

For time being, will commit filetype changes and addition of constants as to what filetype means, but defer having finfo updated automatically until some consensus of mod_python developers is reached.


Graham Dumpleton added a comment - 01/Apr/06 02:15 PM
Tried to get some consensus on what to do about this issue on mod_python developers mailing list. See:

  http://www.mail-archive.com/python-dev@httpd.apache.org/msg01718.html
  http://www.mail-archive.com/python-dev@httpd.apache.org/msg01722.html
  http://www.mail-archive.com/python-dev@httpd.apache.org/msg01724.html
  http://www.mail-archive.com/python-dev@httpd.apache.org/msg01734.html
  http://www.mail-archive.com/python-dev@httpd.apache.org/msg01725.html
  http://www.mail-archive.com/python-dev@httpd.apache.org/msg01736.html

but wasn't necessarily even able to get agreement on the point of whether the ability to update finfo is even needed.

Have thus stopped working on the issue.

Note that not having this ability means that MODPYTHON-123 can't be implemented as such a handler wouldn't be able to update finfo corresponding to filename matched to.

Graham Dumpleton added a comment - 23/Apr/06 05:39 PM
Here is one last attempt to get some sort of solution for this happening.

Have attached "MP128_20060423_grahamd_1.diff" containing a proposed patch to add a distinct req.update_finfo() function. When this function is called it will do an apr_stat() on the current value of req.filename and update req.finfo correspondingly. The result of the apr_stat() can be determined by consultation of req.finfo and specifically the req.finfo[FINFO_FILETYPE] attribute. It being APR_NOFILE if the stat failed.




Graham Dumpleton added a comment - 13/Aug/06 11:12 AM
Another item which it would be nice to get resolved for mod_python 3.3.

My latest thinking on this is that we need an actual C based Python object which wraps an instance of a apr_finfo_t structure. This object would be what is returned when req.finfo is accessed. The Python object would need to preserve the existing means of accessing attributes by assuming object is a tuple and accessing values by using special index keys defined in mod_python.apache file. This means of access though would be seen as being obsolete and supported for backward compatibility only. Instead, going forward code should be converted to use attribute style access, thus we would have:

  OLD --> req.finfo[mod_python.apache.FINFO_FILETYPE]
  NEW -->req.finfo.filetype

As much as makes sense of the apr_finfo_t object would be made accessible.

In respect of the original problem of how to update req.finfo when req.filename is changed, two new functions would be added to mod_python.apache. This would be:

  mod_python.apache.stat(fname, wanted)
  mod_python.apache.lstat(fname, wanted)

The 'wanted' argument would be replaced with bitwise or of new global constants defined in mod_python equivalent to:

#define APR_FINFO_LINK 0x00000001 /**< Stat the link not the file itself if it is a link */
#define APR_FINFO_MTIME 0x00000010 /**< Modification Time */
#define APR_FINFO_CTIME 0x00000020 /**< Creation or inode-changed time */
#define APR_FINFO_ATIME 0x00000040 /**< Access Time */
#define APR_FINFO_SIZE 0x00000100 /**< Size of the file */
#define APR_FINFO_CSIZE 0x00000200 /**< Storage size consumed by the file */
#define APR_FINFO_DEV 0x00001000 /**< Device */
#define APR_FINFO_INODE 0x00002000 /**< Inode */
#define APR_FINFO_NLINK 0x00004000 /**< Number of links */
#define APR_FINFO_TYPE 0x00008000 /**< Type */
#define APR_FINFO_USER 0x00010000 /**< User */
#define APR_FINFO_GROUP 0x00020000 /**< Group */
#define APR_FINFO_UPROT 0x00100000 /**< User protection bits */
#define APR_FINFO_GPROT 0x00200000 /**< Group protection bits */
#define APR_FINFO_WPROT 0x00400000 /**< World protection bits */
#define APR_FINFO_ICASE 0x01000000 /**< if dev is case insensitive */
#define APR_FINFO_NAME 0x02000000 /**< ->name in proper case */
#define APR_FINFO_MIN 0x00008170 /**< type, mtime, ctime, atime, size */
#define APR_FINFO_IDENT 0x00003000 /**< dev and inode */
#define APR_FINFO_OWNER 0x00030000 /**< user and group */
#define APR_FINFO_PROT 0x00700000 /**< all protections */
#define APR_FINFO_NORM 0x0073b170 /**< an atomic unix apr_stat() */
#define APR_FINFO_DIRENT 0x02000000 /**< an atomic unix apr_dir_read() */

The result of the stat() and lstat() functions would be an instance of the Python object wrapping the apr_finfo_t structure.

Finally, the req.finfo attribute would be made such that it could be assigned to by an instance of the Python object wrapping an apr_finfo_t structure. Thus:

  req.filename = newfile
  req.finfo = mod_python.apache.stat(newfile, mod_python.apache.APR_FINFO_MIN)

The affect of the assignment would be that the finfo attribute in the underlying request_rec structure would be updated from that held in the Python wrapper object for the apr_finfo_t structure instance returned by the stat() call.

This should all be quite doable and the similarity to the Apache API is preserved.

Graham Dumpleton added a comment - 25/Aug/06 01:07 AM
As final proposal, changed req.finfo to be an object where fields of underlying fields of apr_finfo_t structure are accessed as object attributes. Tuple style access is still maintained for backward compatability. The req.finfo attribute is now also writable and can be assigned with the result of calling new functions apache.stat() and apache.lstat().