SA Bugzilla – Bug 4829
[review] Using each to iterate through arrays then exiting early causes problems
Last modified: 2006-04-09 06:44:41 UTC
While trying to debug why my plugin from bug 4255 was sometimes not hitting on certain rules I came across the behavior where the "while(...each...)" wasn't going through all of the entries in the hash table. After some digging, I found that somewhere in SA, we start iterating through the $permsgstatus-> {html}->{uri_detail} hash via each and exit early, presumably when a rule hits. However, perl keeps the iteration state when using each(), and then when we try doing each over the same array in another section of code, we miss hash entries. To see what I mean: $ perl -e '%t=map{$_=>1}qw/1 2 3/; while(($k,$v)=each %t){print "1: $k\n"; last;} while(($k,$v)=each % t){print "2: $k\n";}' 1: 1 2: 3 2: 2
found a few EvalTests which could cause this problem. patch to be attached shortly. I put in a slightly larger version into 3.2.
Created attachment 3419 [details] suggested patch
+1
IMO that fix is a bit kludgy! sorry Theo ;) but +1 anyway, since it does the job. Here's an *addition* -- calling "keys %t" to reset the iterator after the loop. I think this is cleaner and less brittle, since it doesn't rely on completing a full set of iterations to clear the iterator state. as you can see, it works: perl -e '%t=map{$_=>1}qw/1 2 3/; while(($k,$v)=each %t){print "1: $k\n"; last;} keys %t; while(($k,$v)=each %t){print "2: $k\n";}' I'll attach the patch, anyway, but I don't think it's a biggie if this gets into 3.1.2 as well as 3.2.0, since it's just a cleanup rather than the more important bugfix that Theo's patch represents. it's cumulative on top of that patch btw. btw I also checked for other cases where this can cause trouble elsewhere in the code, looks like we're clear.
Created attachment 3443 [details] addition
added to 3.2.0 btw: Sending lib/Mail/SpamAssassin/EvalTests.pm Transmitting file data . Committed revision 390993.
(In reply to comment #4) > IMO that fix is a bit kludgy! sorry Theo ;) but +1 anyway, since it does the job. Heh. > Here's an *addition* -- calling "keys %t" to reset the iterator after > the loop. I think this is cleaner and less brittle, since it doesn't rely on > completing a full set of iterations to clear the iterator state. eww! actually I think that's way more klugy! (sorry) the next() version is just not the most efficient it could be (though the URL list ought to be pretty small). well, either way, as long as the problem gets fixed. we can argue about which is the cleanest/most efficient solution later. ;)
well, fwiw, here's a better suggestion from Aristotle Pagaltzis -- s/return foo/keys %hash;return foo/, and similarly s/last/keys %hash;last/. in other words, just fix the exact spots where we break out of the loop (through return or last), so that they reset the iterator just before their break; that way, the reset and the break are colocated beside each other, and hard to accidentally disassociate in future.
Regarding comment #8, is putting keys %hash inside the loop before exiting guaranteed to work to reset the iterator even though the loop has not yet been exited? If the perl language spec says it has to work, then I'm +1 for doing it that way.
Ok, I've verified that keys %t and values %t both are defined to have as side effect resetting the iterator, and there is no other specific operator for doing that. I'm +1 for the suggestion in comment #8 as long as a comment is put alongside the two keys calls saying that it is to reset the iterator.
Created attachment 3464 [details] Patch against 3.1 branch What do you think of this simpler patch? From comment #8: "in other words, just fix the exact spots where we break out of the loop (through return or last)" If we break out of the loop with a return, there is no reason to use last. All we have to do is insert the keys %hash before the return 1. The return 0 doesn't need it and we don't need a last.
(In reply to comment #11) > What do you think of this simpler patch? I still think the keys/values thing is a bit hackish, but I do like the more concise patch.
+1 on 3464 -- it does look a little hacky, but that's the most concise way perl lets us do it. it really is a silly feature IMO :(
Committed to 3.1 revision 392707. My apologies if Theo did not intend comment #12 to be a vote. Also committed this version to trunk revision 392708.