|
SA Bugzilla – Full Text Bug Listing |
Summary: | revert 'rawbody' rule type to split paragraph blocks at 1024 chars, to avoid DOS problems | ||
---|---|---|---|
Product: | Spamassassin | Reporter: | Justin Mason <jm> |
Component: | Libraries | Assignee: | SpamAssassin Developer Mailing List <dev> |
Status: | RESOLVED FIXED | ||
Severity: | major | CC: | bugzilla.spamassassin.org |
Priority: | P1 | ||
Version: | unspecified | ||
Target Milestone: | 3.3.0 | ||
Hardware: | Other | ||
OS: | other | ||
Whiteboard: | |||
Bug Depends on: | |||
Bug Blocks: | 5041, 5425, 5486, 5644, 5795 | ||
Attachments: | equivalent patch for 3.2 |
Description
Justin Mason
2007-11-13 02:20:40 UTC
I have no objection to reverting raw rules to per-line regexps BY DEFAULT. After all, most existing rules were written for that regieme. But if the rule is reverted to per-line, can we than have raw:all or raw:full or raw:morethanaline or some such to retain this capability for those few rules that would like to specifically run in this mode? (In reply to comment #1) > I have no objection to reverting raw rules to per-line regexps BY DEFAULT. > After all, most existing rules were written for that regieme. > > But if the rule is reverted to per-line, can we than have raw:all or raw:full > or raw:morethanaline or some such to retain this capability for those few > rules that would like to specifically run in this mode? Yeah; probably the best bet would be to add a new rule type. upping priority. committers: anyone disagree with this plan? Introduced a new subroutine Message::split_into_array_of_short_lines to nicely split a text into array of paragraph chunks of sizes between 1 kB and 2 kB, returning the resulting array. Called from get_decoded_body_text_array() and in Plugin::BodyEval::_check_stock_info (bug 5644, similar to proposed patch there). The more degenerated a mail message text is, the more pronounced a speedup is. Shown elapsed times in seconds for priority 0 tests, as reported in the 'timing' debug line. tests_pri_0: 46 -> 45 s (sample message from bug 5041, s1) (no speedup) tests_pri_0: 62 -> 49 s (sample message from bug 5041, s2) tests_pri_0: 486 -> 82 s (sample message from bug 5041, s3) tests_pri_0: 104 -> 10 s (sample message from bug 5644) tests_pri_0: 22 -> 11 s (sample message from bug 5486) tests_pri_0: 5.7 > 3.8 s (sample message from bug 5795, s1) tests_pri_0: 5.8 > 2.6 s (sample message from bug 5795, s2) tests_pri_0: 38 -> 25 s (0.msg one of our today's messages) tests_pri_0: 12 -> 8 s (1.msg) tests_pri_0: 8.2 4.3 (2.msg) tests_pri_0: 8.0 > 7.6 (3.msg) tests_pri_0: 8.8 > 5.8 (4.msg) $ svn -m 'New sub Message::split_into_array_of_short_lines to nicely split a text into array of paragraph chunks of sizes between 1 kB and 2 kB; bugs: 5717, 5644, 5795, 5486, 5801, 5041' ci Sending SpamAssassin/Message/Node.pm Sending SpamAssassin/Message.pm Sending SpamAssassin/Plugin/BodyEval.pm Committed revision 629888. > Introduced a new subroutine Message::split_into_array_of_short_lines
> $ svn -m 'New sub Message::split_into_array_of_short_lines to nicely...
typo, the name is actually:
split_into_array_of_short_paragraphs
excellent! You kept that one secret ;) This fix is fine by me. +1 btw, I checked to see just how long a 1k paragraph is. here's a sample, which seems like quite a reasonable length: 'This is a demonstration of how long a paragraph size 1024 characters is; it seems reasonably long if you ask me. Great! This is a demonstration of how long a paragraph size 1024 characters is; it seems reasonably long if you ask me. Great! This is a demonstration of how long a paragraph size 1024 characters is; it seems reasonably long if you ask me. Great! This is a demonstration of how long a paragraph size 1024 characters is; it seems reasonably long if you ask me. Great! This is a demonstration of how long a paragraph size 1024 characters is; it seems reasonably long if you ask me. Great! This is a demonstration of how long a paragraph size 1024 characters is; it seems reasonably long if you ask me. Great! This is a demonstration of how long a paragraph size 1024 characters is; it seems reasonably long if you ask me. Great! This is a demonstration of how long a paragraph size 1024 characters is; it seems reasonably long if you ask me. Great! This is a demonstration of how long a paragraph size 1024 characters' btw, I suggest we either close this bug as FIXED, now that it's applied to 3.3.0, or retarget at 3.2.5 if we agree that this should be applied to that branch. (Personally I favour the former option. I think the change is too major for 3.2.x maintainance.) once this is closed, we can close bugs 5041, 5486, 5644 and 5795, right? > btw, I suggest we either close this bug as FIXED, now that it's applied to > 3.3.0, or retarget at 3.2.5 if we agree that this should be applied to that > branch. Fine with me, one way or the other. > once this is closed, we can close bugs 5041, 5486, 5644 and 5795, right? 5041 not resolved 5486 ok, can be closed 5644 ok, can be closed 5795 ok, can be closed The bug 5041 situation is even more likely now, as the current default SA size limit (500 KB according to spamc man page) is now above the default chunk size to which Outlook chops its message/partial chunks. (In reply to comment #8) > The bug 5041 situation is even more likely now, as the current default > SA size limit (500 KB according to spamc man page) is now above the > default chunk size to which Outlook chops its message/partial chunks. ok, as I suggest there, we should at least kill off the easily-fixed part of that -- not scanning message/partial. not sure what to do with the rest though. :( (In reply to comment #7) > (Personally I favour the former option. I think the change is too major for > 3.2.x maintainance.) +1 Created attachment 4267 [details]
equivalent patch for 3.2
Just in case someone wants this fix for 3.2, here is
the patch.
ok, let's close it. thanks Mark!(In reply to comment #8) > > btw, I suggest we either close this bug as FIXED, now that it's applied to > > 3.3.0, or retarget at 3.2.5 if we agree that this should be applied to that > > branch. > > Fine with me, one way or the other. ok, closing! thanks Mark... > > > btw, I suggest we either close this bug as FIXED, now that it's applied to
> > > 3.3.0, or retarget at 3.2.5 if we agree that this should be applied to that
> > > branch.
> ok, closing!
This entry is targeted at 3.2.5, yet it is closed and a change is only in 3.3.
Wrong target?
(In reply to comment #13) > > > > btw, I suggest we either close this bug as FIXED, now that it's applied to > > > > 3.3.0, or retarget at 3.2.5 if we agree that this should be applied to that branch. > > ok, closing! > > This entry is targeted at 3.2.5, yet it is closed and a change is only in 3.3. > Wrong target? yep; I retargeted prematurely and then forgot to undo that. |