We recommend that, for uploads of files to a repository, the following cases be handled to provide greater robustness.
1.) All uploads should go to a "staging" area. This staging area could be the same directory or a temp directory, and the file would be uploaded with the file name extension of
Henk Penning comments:
> That would be great.
> I think, the best way for adding/replace stuff is
> – write a 'temp'
> – rename 'temp' to 'file'
> because a rename is truly atomic if 'temp' and 'file' are
> in the same file system.
> If you can implement the 'temp' for 'file' to be,
> for instance, '.tmp.file', I can easily teach the checkers
> to ignore '.tmp.*' files. I think rsync does something
> like that (even better .tmp.$$.file).
So the goals here are to verify that rsync handles files named like ".tmp.$$.file" in a way that stops it from attempting to sync partial uploads. Henk can alter the md5 checking utilities at Apache to postpone checking the md5 signatures of .tmp files.
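The write-then-rename approach Henk describes can be sketched as follows. This is only a minimal illustration, assuming the staging file lives in the same directory as the final file (so the rename stays within one file system) and uses the '.tmp.<pid>.<name>' pattern from the comments above. The function names staged_name, atomic_upload and is_staging_file are hypothetical and not part of any existing tool.

```python
import os


def staged_name(final_path):
    """Return a staging path '.tmp.<pid>.<name>' next to final_path (assumed naming)."""
    directory, name = os.path.split(final_path)
    return os.path.join(directory, ".tmp.%d.%s" % (os.getpid(), name))


def atomic_upload(data, final_path):
    """Write data to a staging file, then rename it onto final_path."""
    tmp_path = staged_name(final_path)
    try:
        with open(tmp_path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before the rename
        # Same directory, hence same file system: the rename is atomic on POSIX.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Do not leave a partial upload behind.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def is_staging_file(path):
    """True for names a checker could skip, i.e. those matching '.tmp.*'."""
    return os.path.basename(path).startswith(".tmp.")
```

A checker or sync script could call is_staging_file on each path and skip the ones that match, so partial uploads are never verified or mirrored.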
2.) File permissions on uploaded files would best be set so that each file is writable only by the individual user, not writable by group, and readable by all. Directory permissions should be writable by user and group and readable by all. This requires the following implementation.
Any file upload that attempts to overwrite a file should instead move that file out of the way to a temporary location, upload the new file using strategy (1), and then rename it to the old file name. Once this is completed, the old file can be removed. This provides a means by which file "ownership" can be determined and maintained. The problem this solves is the following: if files are "group writable", then any individual in the group can overwrite a file and alter its contents, and historically we cannot tell who actually made the alteration. If there are concerns about the integrity of the artifact or its signature, it is unclear who was responsible for the alteration.
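The overwrite procedure above can be sketched as follows. This is a rough illustration under stated assumptions: a POSIX file system, a group-writable directory (which is what allows a different group member to rename a file they cannot write), and mode 0644 as the concrete form of "writable by user, readable by all". The name replace_upload and the '.tmp.<pid>.<name>.old' naming for the set-aside file are hypothetical.

```python
import os

FILE_MODE = 0o644  # rw-r--r--: writable by the uploader only, readable by all


def replace_upload(data, final_path):
    """Replace final_path with a new file owned by the current uploader."""
    directory, name = os.path.split(final_path)
    pid = os.getpid()
    tmp_path = os.path.join(directory, ".tmp.%d.%s" % (pid, name))
    old_path = os.path.join(directory, ".tmp.%d.%s.old" % (pid, name))

    # Step 1: move the existing file out of the way.
    had_old = os.path.exists(final_path)
    if had_old:
        os.rename(final_path, old_path)
    try:
        # Step 2: upload to a staging file, created fresh so that it
        # belongs to the uploading user.
        fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, FILE_MODE)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.chmod(tmp_path, FILE_MODE)  # the umask may have altered the mode
        # Step 3: give the new file the old file's name.
        os.rename(tmp_path, final_path)
    except BaseException:
        # Roll back: drop the partial upload and restore the old file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        if had_old:
            os.rename(old_path, final_path)
        raise
    # Step 4: the old file can now be removed.
    if had_old:
        os.remove(old_path)
```

Because the replacement is always a newly created file, its owner on disk is the user who performed the upload, which is the "ownership" record the strategy relies on.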