SA Bugzilla – Bug 6083
sa-update should periodically validate MIRRORED.BY files
Last modified: 2009-09-15 14:41:11 UTC
I noticed that after the initial update installation, sa-update will not try to update MIRRORED.BY files unless there is an update to download. This means that channels w/ infrequent updates will have clients using potentially outdated/changed MIRRORED.BY files. That then leads to a potential situation where all the old mirrors are no longer functional, and updates will fail. e.g.:

1- sa-update run completes, gets MIRRORED.BY file from mirrors.[channel] DNS pointer, stores it. entries are for serverX, serverY, and serverZ.
2- time goes on, no channel updates are published, but the previous servers are replaced by serverA, serverB, and serverC and the MIRRORED.BY file is updated.
3- eventually an update is published and DNS updated.
4- client machines see they have a cached MIRRORED.BY file so try to download updates from serverX, serverY, and serverZ, all of which fail. at this point, sa-update fails the channel as no mirrors are available.

So I would suggest that step 3.5 be inserted such that if the MIRRORED.BY timestamp is old (say, >30d?), an attempt to update it is made using the mirrors (current method) w/ fallback to mirrors.[channel] if necessary. It would also be useful, probably, to try this if the channel download fails due to all mirrors failing.
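The second half of the suggestion (re-fetch the mirror list when every cached mirror fails) can be sketched as below. This is only an illustration, not sa-update's code: `download_with_fallback`, `fetch_update`, and `refetch_mirrors` are hypothetical names, and the callbacks stand in for the real HTTP download and the mirrors.[channel] lookup.

```python
def download_with_fallback(cached_mirrors, fetch_update, refetch_mirrors):
    """Try every cached mirror; if all fail, refresh the list once and retry.

    cached_mirrors  -- mirror names/URLs from the stored MIRRORED.BY file
    fetch_update    -- hypothetical callback: returns True if the download worked
    refetch_mirrors -- hypothetical callback: returns a fresh list of mirrors,
                       e.g. obtained via the mirrors.[channel] DNS pointer
    Returns the mirror that succeeded, or None if nothing worked.
    """
    for mirror in cached_mirrors:
        if fetch_update(mirror):
            return mirror

    # All cached mirrors failed: the cached list may be outdated, so fetch a
    # fresh one and try again instead of failing the channel immediately.
    for mirror in refetch_mirrors():
        if fetch_update(mirror):
            return mirror

    return None


if __name__ == "__main__":
    # Scenario from the report: serverX/Y/Z are gone, serverA/B/C replaced them.
    alive = {"serverA", "serverB", "serverC"}
    used = download_with_fallback(
        ["serverX", "serverY", "serverZ"],
        fetch_update=lambda m: m in alive,
        refetch_mirrors=lambda: ["serverA", "serverB", "serverC"],
    )
    print(used)
```

With this ordering the extra lookup only happens on the failure path, so channels whose cached mirrors still work see no additional traffic.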
Target Milestone 3.3, has been reported on the users list twice already.
Unfortunately, this is *not* a potential issue, as mentioned in comment 0. There are outdated MIRRORED.BY files (even for SA 3.2.5) out there, with a single mirror only. Which, coincidentally, has been removed earlier this year. See the users list as of yesterday/today. Of course, we can't do anything about those that exist already, since a fix release will result in a new, versioned update dir and fresh mirrors. The only cure for those is to rm the MIRRORED.BY file. Granted, with currently 2 mirrors, this is less likely to occur with future installs. Yet not a potential issue, but a real-life problem.
Why does SA even use a mirrors file instead of direct TXT records? Here's a live example with a dozen round-robin TXT entries, including the longest real-life mirror I could find plus a ridiculously long example URL. Am I missing some piece of the RFC that prohibits this? We're already delayed by the propagation time of the versions, so that doesn't affect anything...

$ host -t txt mirrors.testtxt.khopesh.com. |sort |perl -pne 's/^.*"([^"]+)"$/$1/'
http://abcdefghijklmnopqrstuvwxyz.abcdefghijklmnopqrstuvwxyz.museum/abcdefghijklmnopqrstuvwxyz/abcdefghijklmnopqrstuvwxyz/this-is-a-ridiculously-long-sa-update-channel-name-with-tons-and-tons-of-text.cf weight=99999999999999999
http://daryl.dostech.ca/sa-update/sare/72_sare_redirect_post3.0.0.cf weight=500
http://mirror-03.example.com/testtxt weight=5
http://mirror-04.example.com/testtxt weight=4
http://mirror-05.example.com/testtxt weight=4
http://mirror-06.example.com/testtxt weight=4
http://mirror-07.example.com/testtxt weight=4
http://mirror-08.example.com/testtxt weight=2
http://mirror-09.example.com/testtxt weight=2
http://mirror-10.example.com/testtxt weight=2
http://mirror-11.example.com/testtxt weight=1
http://mirror-12.example.com/testtxt weight=1
$

Every time there is an update, the mirrors should probably be re-cached. sa-update should probably also spit out a warning when there is only one mirror.
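For illustration, entries in the "URL weight=N" form shown above could be parsed and chosen from roughly like this. It is a sketch under assumptions, not sa-update's actual logic: the function names are hypothetical, and the default weight of 1, the skipping of "#" lines, and the weighted random pick are choices made for this example.

```python
import random


def parse_mirror_entries(lines):
    """Parse 'URL [weight=N]' lines into a list of (url, weight) tuples."""
    mirrors = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and '#' comments are ignored
        parts = line.split()
        url, weight = parts[0], 1  # assumption: a missing weight counts as 1
        for option in parts[1:]:
            key, _, value = option.partition("=")
            if key == "weight" and value.isdigit():
                weight = int(value)
        mirrors.append((url, weight))
    return mirrors


def pick_mirror(mirrors, rng=random):
    """Pick one URL at random, with probability proportional to its weight."""
    urls = [url for url, _ in mirrors]
    weights = [weight for _, weight in mirrors]
    return rng.choices(urls, weights=weights, k=1)[0]


if __name__ == "__main__":
    entries = parse_mirror_entries([
        "http://mirror-03.example.com/testtxt weight=5",
        "http://mirror-11.example.com/testtxt weight=1",
    ])
    if len(entries) == 1:
        print("warning: only one mirror listed")
    print(pick_mirror(entries))
```

The single-mirror check in the usage part mirrors the warning suggested above; where such a warning would actually belong in sa-update is left open.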
I don't recall all of the details at the moment, but I think the main concern I had was that the length of the information in the mirby file would cause lots of TCP DNS queries. The plan was DNS for small/quick things and HTTP for larger bits. sure enough:

$ host -t txt mirrors.testtxt.khopesh.com
;; Truncated, retrying in TCP mode.
mirrors.testtxt.khopesh.com descriptive text "http://daryl.dostech.ca/sa-update/sare/72_sare_redirect_post3.0.0.cf weight=500"
mirrors.testtxt.khopesh.com descriptive text "http://abcdefghijklmnopqrstuvwxyz.abcdefghijklmnopqrstuvwxyz.museum/abcdefghijklmnopqrstuvwxyz/abcdefghijklmnopqrstuvwxyz/this-is-a-ridiculously-long-sa-update-channel-name-with-tons-and-tons-of-text.cf weight=99999999999999999"
[...]

(In reply to comment #3)
> Why does SA even use a mirrors file instead of direct TXT records? Here's a
> live example with a dozen round-robin TXT entries, including the longest
> real-life mirror I could find plus a ridiculously long example URL. Am I
> missing some piece of the RFC that prohibits this? We're already delayed by
> the propagation time of the versions, so that doesn't affect anything...
would this need to be fixed before 3.3.0? if so set pri to P1.
Priority 1, not an enhancement. This currently bites again due to an expired third-party mirror domain.
(In reply to comment #6)
> Priority 1, not an enhancement. This currently bites again due to an expired
> third-party mirror domain.

btw, bear in mind that the expired domain (sa-updates.com) is not being used for updates.spamassassin.org, just for third-party rulesets (Daryl's own SARE updates). so not quite so urgent. (I made that mistake myself but caught it before I filed the bug ;)
also, fwiw, I suggest we re-get MIRRORED.BY if it's older than 7 days.
I suggest to publish the location of known-good mirror files by DNS via TXT records, so you do one DNS query to find out where the MIRRORED.BY file is, then retrieve it (eg. via HTTP), and go from there.
(In reply to comment #7)
> btw, bear in mind that the expired domain (sa-updates.com) is not being used
> for updates.spamassassin.org, just for third-party rulesets (Daryl's own SARE
> updates). so not quite so urgent.

I know, and I clearly stated that in comment 6, didn't I? ;) The reason for Priority 1 (and a normal Severity, mind you) is simply comment 5, plus the fact that this is the second instance where lots of MIRRORED.BY files became invalid and needed manual admin intervention.
(In reply to comment #9)
> I suggest to publish the location of known-good mirror files by DNS via TXT
> records, so you do one DNS query to find out where the MIRRORED.BY file is,
> then retrieve it (eg. via HTTP), and go from there.

that, in fact, is exactly what it does right now. ;) It just doesn't re-retrieve it after it's been retrieved once.
(In reply to comment #8)
> also, fwiw, I suggest we re-get MIRRORED.BY if it's older than 7 days.

Minor objection: I have modified my MIRRORED.BY files to use the Coral cache network; having sa-update do this would force me to revisit the MIRRORED.BY files weekly, or set up a cron job to touch them and keep them "fresh" (which would break the welcome self-repairing aspect of this suggestion). I've added Bug 6181 to make this less of a problem.
some flight time gave me a chance to do this ;) :

234...; svn commit -m "bug 6083: re-download MIRRRORED.BY files at least once a week, or if 'sa-update --refreshmirrors' switch is used" sa-update.raw
Sending sa-update.raw
Transmitting file data .
Committed revision 815500.
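The rule described in the commit message (re-download at least once a week, or when the refresh switch is given) amounts to a check along these lines. This is a hedged sketch rather than the committed code: `should_refresh_mirrors` and its parameters are hypothetical, and treating a missing file as "needs refresh" is an assumption.

```python
import os
import time

SECONDS_PER_DAY = 86400


def should_refresh_mirrors(path, force=False, max_age_days=7, now=None):
    """Decide whether the cached MIRRORED.BY file at `path` should be re-downloaded.

    force        -- stands in for the 'sa-update --refreshmirrors' switch
    max_age_days -- age limit; 7 matches "at least once a week"
    now          -- current time in seconds since the epoch (defaults to time.time())
    """
    if force:
        return True
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
    except OSError:
        return True  # assumption: no readable cached file means we must fetch one
    return age > max_age_days * SECONDS_PER_DAY


if __name__ == "__main__":
    print(should_refresh_mirrors("MIRRORED.BY"))
    print(should_refresh_mirrors("MIRRORED.BY", force=True))
```

Because the decision rests on the file's modification time, touching the file resets the clock, which is the cron-job workaround mentioned in the earlier objection.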