Bug 4829 - [review] Using each to iterate through arrays then exiting early causes problems
Summary: [review] Using each to iterate through arrays then exiting early causes problems
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Libraries (show other bugs)
Version: 3.1.1
Hardware: Other other
: P5 major
Target Milestone: 3.1.2
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-03-17 04:10 UTC by Theo Van Dinter
Modified: 2006-04-09 06:44 UTC (History)
0 users



Attachment Type Modified Status Actions Submitter/CLA Status
suggested patch patch None Theo Van Dinter [HasCLA]
addition patch None Justin Mason [HasCLA]
Patch against 3.1 branch patch None Sidney Markowitz [HasCLA]

Note You need to log in before you can comment on or make changes to this bug.
Description Theo Van Dinter 2006-03-17 04:10:01 UTC
While trying to debug why my plugin from bug 4255 was sometimes not hitting on certain rules I came 
across the behavior where the "while(...each...)" wasn't going through all of the entries in the hash table.

After some digging, I found that somewhere in SA, we start iterating through the $permsgstatus->
{html}->{uri_detail} hash via each and exit early, presumably when a rule hits.  However, perl keeps the 
iteration state when using each(), and then when we try doing each over the same array in another 
section of code, we miss hash entries.  To see what I mean:

$ perl -e '%t=map{$_=>1}qw/1 2 3/; while(($k,$v)=each %t){print "1: $k\n"; last;} while(($k,$v)=each %
t){print "2: $k\n";}'
1: 1
2: 3
2: 2
Comment 1 Theo Van Dinter 2006-03-18 07:18:49 UTC
found a few EvalTests which could cause this problem.  patch to be attached shortly.  I put in a slightly 
larger version into 3.2.
Comment 2 Theo Van Dinter 2006-03-18 07:20:51 UTC
Created attachment 3419 [details]
suggested patch
Comment 3 Daryl C. W. O'Shea 2006-03-31 03:09:02 UTC
+1
Comment 4 Justin Mason 2006-04-03 11:23:27 UTC
IMO that fix is a bit kludgy!  sorry Theo ;) but +1 anyway, since it does the job.

Here's an *addition* -- calling "keys %t" to reset the iterator after
the loop.  I think this is cleaner and less brittle, since it doesn't rely on
completing a full set of iterations to clear the iterator state.

as you can see, it works:

  perl -e '%t=map{$_=>1}qw/1 2 3/; while(($k,$v)=each %t){print "1: $k\n";
last;} keys %t; while(($k,$v)=each %t){print "2: $k\n";}'

I'll attach the patch, anyway, but I don't think it's a biggie if this gets into
3.1.2 as well as 3.2.0, since it's just a cleanup rather than the more important
bugfix that Theo's patch represents.  it's cumulative on top of that patch btw.

btw I also checked for other cases where this can cause trouble elsewhere in the
code, looks like we're clear.
Comment 5 Justin Mason 2006-04-03 11:24:07 UTC
Created attachment 3443 [details]
addition
Comment 6 Justin Mason 2006-04-03 11:25:41 UTC
added to 3.2.0 btw:

Sending        lib/Mail/SpamAssassin/EvalTests.pm
Transmitting file data .
Committed revision 390993.
Comment 7 Theo Van Dinter 2006-04-03 17:14:35 UTC
(In reply to comment #4)
> IMO that fix is a bit kludgy!  sorry Theo ;) but +1 anyway, since it does the job.

Heh.

> Here's an *addition* -- calling "keys %t" to reset the iterator after
> the loop.  I think this is cleaner and less brittle, since it doesn't rely on
> completing a full set of iterations to clear the iterator state.

eww!  actually I think that's way more klugy!  (sorry)  the next() version is just not the most efficient it 
could be (though the URL list ought to be pretty small).

well, either way, as long as the problem gets fixed.  we can argue about which is the cleanest/most 
efficient solution later. ;)
Comment 8 Justin Mason 2006-04-03 17:52:07 UTC
well, fwiw, here's a better suggestion from Aristotle Pagaltzis -- s/return
foo/keys %hash;return foo/, and similarly s/last/keys %hash;last/.  in other
words, just fix the exact spots where we break out of the loop (through return
or last), so that they reset the iterator just before their break; that way, the
reset and the break are colocated beside each other, and hard to accidentally
disassociate in future.
Comment 9 Sidney Markowitz 2006-04-08 01:49:35 UTC
Regarding comment #8, is putting keys %hash inside the loop before exiting
guaranteed to work to reset the iterator even though the loop has not yet been
exited?

If the perl language spec says it has to work, then I'm +1 for doing it that way.
Comment 10 Sidney Markowitz 2006-04-08 23:41:58 UTC
Ok, I've verified that keys %t and values %t both are defined to have as side
effect resetting the iterator, and there is no other specific operator for doing
that.

I'm +1 for the suggestion in comment #8 as long as a comment is put alongside
the two keys calls saying that it is to reset the iterator.
Comment 11 Sidney Markowitz 2006-04-09 00:51:23 UTC
Created attachment 3464 [details]
Patch against 3.1 branch


What do you think of this simpler patch?

From comment #8: "in other words, just fix the exact spots where we break out
of the loop (through return or last)"

If we break out of the loop with a return, there is no reason to use last. All
we have to do is insert the keys %hash before the return 1. The return 0
doesn't need it and we don't need a last.
Comment 12 Theo Van Dinter 2006-04-09 01:21:12 UTC
(In reply to comment #11)
> What do you think of this simpler patch?

I still think the keys/values thing is a bit hackish, but I do like the more concise patch.
Comment 13 Justin Mason 2006-04-09 13:05:30 UTC
+1 on 3464 -- it does look a little hacky, but that's the most concise
way perl lets us do it.

it really is a silly feature IMO :(
Comment 14 Sidney Markowitz 2006-04-09 13:44:41 UTC
Committed to 3.1 revision 392707.

My apologies if Theo did not intend comment #12 to be a vote.

Also committed this version to trunk revision 392708.