SA Bugzilla – Bug 4941
[review] sa-update doesn't clean up after itself on failure
Last modified: 2006-07-24 08:10:29 UTC
It looks like if sa-update fails, it doesn't clean up the directories it created. spamassassin then looks in this empty directory for its rules files, and of course, finds none. (See bug at provided URL for more detail)
a good explanation from bug 4997: 'The problem is that on the first sa-update run, the new local state directory is made at the start to hold the temp files, and then when the uncompress fails, sa-update dies leaving an empty local state dir. fwiw. :)'
Created attachment 3602 [details] fix here's a patch. it (a) defers creating the dir until everything's downloaded, gpg-verified, linted etc. (b) also attempts a "test rename" to ensure the dir can be written to, uid has permissions, etc.; if that fails, it rolls back changes to the dir and aborts the upgrade (c) makes some "write failed" error cases from try-again warnings into fatal errors, which they should be (d) caught a missing closedir().
setting review state for 3.1.x. checked in as r424325 for trunk.
This could just be my "not quite awake" state, but ... UPDTmp is under UPDDir: my $UPDDir = "$opt{'updatedir'}/$nicechannel"; my $UPDTmp = "$opt{'updatedir'}/$nicechannel.tmp"; So since UPDTmp is still created at the beginning of the process, $UPDDir will therefore also be created, and so the problem will still exist.
nope -- UPDTmp ("/path/foo") is parallel to UPDDir ("/path/foo.tmp"). right?
(In reply to comment #5) > nope -- UPDTmp ("/path/foo") is parallel to UPDDir ("/path/foo.tmp"). right? Except the problem is that /path (generally called the local state dir) exists. ;)
oh wait a sec -- I get your point. UPDTmp is under 'updatedir', rather than UPDDir, and that's the dir that Mail::SpamAssassin checks for -- so even with this additional code, it'll still try to use the empty dir (even with just temp files in it). good point. :( ok -- we need an additional fix. I suggest using '__local_state_dir__/spamassassin/__version__.tmp/$nicechannel' as the temp dir for '__local_state_dir__/spamassassin/__version__/$nicechannel'. sound good?
(In reply to comment #7) > oh wait a sec -- I get your point. UPDTmp is under 'updatedir', rather than > UPDDir, and that's the dir that Mail::SpamAssassin checks for -- so even with > this additional code, it'll still try to use the empty dir (even with just temp > files in it). Yes! Right, sorry -- wasn't quite awake yet so that previous comment was a bit off, but the idea got through! :) > good point. :( ok -- we need an additional fix. > > I suggest using '__local_state_dir__/spamassassin/__version__.tmp/$nicechannel' > as the temp dir for '__local_state_dir__/spamassassin/__version__/$nicechannel'. > sound good? I think you mean: __local_state_dir__.tmp/$nicechannel since __local_state_dir__ includes "spamassassin/__version__". :) I've been thinking about this for a bit (just haven't had a chance to code up a patch) and what I ended up with is something like this: - If exists, read in MIRRORED.BY and track the mtime. Otherwise download a new one and flag that it was downloaded. - Create UPDTmp under File::Spec->tmpdir() - Unarchive the update to UPDTmp, test it, and remove the directory. - If error, throw error and go onto next channel. - Clean out $UPDDir or make it if necessary (at this point local_state_dir exists) - Unarchive the update to UPDDir - If not downloaded, check for a new MIRRORED.BY file - Write out MIRRORED.BY file I think this minimizes the possibility of this problem occuring.
Created attachment 3603 [details] additional fix no, here's what I mean ;) basically it uses '$opt{'updatedir'}.tmp/$nicechannel', which in effect means: [17852] dbg: channel: update directory /var/lib/spamassassin/3.002000/updates_spamassassin_org [17852] dbg: channel: update tmp directory /var/lib/spamassassin/3.002000.tmp/updates_spamassassin_org when M:SA goes looking for rules in '/var/lib/spamassassin/3.002000', they now won't be there until they're ready.
btw, I should point out -- there's a reason I'm attempting to keep the tmpdir as near as possible to the "finished" location. It's an attempt to ensure that the two dirs are on the same filesystem, which minimises the possibility of issues during the final "mv" step.
(In reply to comment #9) > basically it uses '$opt{'updatedir'}.tmp/$nicechannel', which in effect means: > > [17852] dbg: channel: update directory > /var/lib/spamassassin/3.002000/updates_spamassassin_org > [17852] dbg: channel: update tmp directory > /var/lib/spamassassin/3.002000.tmp/updates_spamassassin_org > > when M:SA goes looking for rules in '/var/lib/spamassassin/3.002000', they now > won't be there until they're ready. Hrm. My only issue with that is that when UPDTmp is removed, /var/lib/spamassassin/3.002000.tmp is leftover. I'd rather just use that as the temp dir, the heck with $nicechannel. :)
that's a good point. hmm. however, if we were to use just /var/lib/spamassassin/3.002000.tmp, that'd mean that only one "sa-update" can run at the same time, even for multiple channels.
(In reply to comment #12) > that's a good point. hmm. however, if we were to use just > /var/lib/spamassassin/3.002000.tmp, that'd mean that only one "sa-update" can > run at the same time, even for multiple channels. Hrm. I hadn't thought of people running sa-update separately for multiple channels, since sa-update can already handle that in a single run (it's serial, but ...) I still like my method from comment 8 which fixes the issue, even if you're running multiple sa-update commands at the same time. :)
(In reply to comment #13) > I still like my method from comment 8 which fixes the issue, even if you're > running multiple sa-update commands at the same time. :) Ok, I hope you don't mind but I ended up spending a few hours this evening making a bunch of changes to sa-update. It implements the method from comment 8, strips out some now unused code, and cuts out a bunch of temp files. I'm making some finishing touches to it and will commit it to 3.2, then make a patch available here backporting the 3.2 version to 3.1.4.
ok, patch committed to 3.2: Sending lib/Mail/SpamAssassin/Util.pm Sending sa-update.raw Transmitting file data .. Committed revision 424521. it fixes the issue here, and also another I found which caused update .pre files to be ignored. Patch for 3.1 shortly.
Created attachment 3604 [details] suggested patch
wow! +1 ;) there's one issue -- namely that if something stops the files being extracted in the second taint_safe_archive_extract() call (such as an out-of-space condition), those files will be left behind in an incomplete state in $UPDDir. I'd like to see those rolled back too, for safety. Also, IMO, it should simply copy the files from the tmpdir, instead of re-extracting them from the tarball again -- which is inefficient. cross_fs_rename() would be useful to keep for that job. But both of those are icing on the cake, and best considered a separate issue, I think... good patch ;)
(In reply to comment #17) > wow! > +1 ;) :) > there's one issue -- namely that if something stops the files being extracted in > the second taint_safe_archive_extract() call (such as an out-of-space > condition), those files will be left behind in an incomplete state in $UPDDir. > I'd like to see those rolled back too, for safety. Fair enough. I'll put up a new version in a minute which covers this and a few other related things. I'm also swapping around "$dir/$file" with File::Spec->catfile($dir, $file) since we really ought to do that more often. ;) > Also, IMO, it should simply copy the files from the tmpdir, instead of > re-extracting them from the tarball again -- which is inefficient. > cross_fs_rename() would be useful to keep for that job. I could go either way I guess. It seems to be easier, though requires more work and disk space, to just unarchive again than to go through the process of moving files from the temp dir. I liked getting rid of File::Copy as a requirement though. ;) BTW: why cross_fs_rename() instead of File::Copy::move() ? > But both of those are icing on the cake, and best considered a separate issue, > I think... good patch ;) :) Thanks! Sorry for coming up from behind and redoing it on you. This is one of those bugs that have been floating around in my head for a while, and I just needed to sit down and focus on getting a patch out.
Created attachment 3605 [details] suggested patch A continuation of the last patch -- address JM's issue of cleaning out the update directory if the extraction fails. I also cleaned up a few other things that have been annoying me about sa-update, etc. Enjoy. :)
'BTW: why cross_fs_rename() instead of File::Copy::move() ?' ah, missed that ;)
+1 on the newest patch, too
+1 on the new patch
Sending lib/Mail/SpamAssassin/Util.pm Sending sa-update.raw Transmitting file data .. Committed revision 425085. :)