SA Bugzilla – Bug 6579
TVD_PH_SUBJ_ACCOUNTS_POST hits on Twitter confirmation e-mail
Last modified: 2017-12-19 14:42:37 UTC
Hi, TVD_PH_BODY_ACCOUNTS_PRE is hitting on a Twitter confirmation e-mail. I don't have a suggestion for changes to the rule, except perhaps scoring it a bit lower. I see it is set to the following:
score TVD_PH_BODY_ACCOUNTS_PRE 1.201 1.527 1.327 2.393
Why is the last one (for bayes + net) so much higher than the others? Every other rule seems to lower the score for bayes + net, since Bayes and the network tests add to the score as well. Lowering this to something like 1.2 across the board should reduce the effect of FPs, yet still allow the rule to be effective against real spam.
I have included the full message, including headers, where it would have FP'd. In my setup, I have manually rescored the rule.
Return-path: <confirmation-fbpvny=ypjfbsg.pbz-255e9@postmaster.twitter.com> Envelope-to: social@lcwsoft.com Delivery-date: Mon, 14 Feb 2011 15:14:50 -0330 Received: from mx005.twitter.com ([128.121.146.141]) by athena.lcwsoft.com with esmtp (Exim 4.69) (envelope-from <confirmation-fbpvny=ypjfbsg.pbz-255e9@postmaster.twitter.com>) id 1Pp3Pi-0002z4-AI for social@lcwsoft.com; Mon, 14 Feb 2011 15:14:46 -0330 Received: from twitter.com (localhost [127.0.0.1]) by mx005.twitter.com (Postfix) with ESMTP id 05D71A86B8B for <social@lcwsoft.com>; Mon, 14 Feb 2011 18:44:44 +0000 (UTC) X-DKIM: Sendmail DKIM Filter v2.8.2 mx005.twitter.com 05D71A86B8B DKIM-Signature: v=1; a=rsa-sha1; c=simple/simple; d=twitter.com; s=dkim; t=1297709084; i=@twitter.com; bh=0Is0j9KVIfTFXbIHO8bSH/ohy5g=; h=Date:From:Reply-To:To:Message-Id:Subject:Mime-Version: Content-Type; b=mLXEfTMDwegt0uQY5eGiv/mMUGbAFmy4rUT7CjHRtxLKD8vT2dbPCOiKi4U+iMgm7 O+XeOYhIlkUzHDygkl9gB8Z+BDv7AERp/u83/GXvRPPGjmYFSv6aGHIjNw0lSvs16r x5NIk7aa7NZqhT8HP2zZyBjauLqiSXXA4qTQW3Oo= X-DomainKeys: Sendmail DomainKeys Filter v1.0.2 mx005.twitter.com 05D71A86B8B DomainKey-Signature: a=rsa-sha1; s=default; d=twitter.com; c=simple; q=dns; b=xrpz9/w/1zWDj2xVM+vSV1xlQgNxyGIEvHzEAvvdPeI3ek/bbmtQdQihyFsMOYMiN DSMVu0YbB8wzfK39VJ7Iw== Date: Mon, 14 Feb 2011 18:44:44 +0000 From:
Twitter <confirmation-fbpvny=ypjfbsg.pbz-255e9@postmaster.twitter.com> Reply-To: noreply@postmaster.twitter.com To: social@lcwsoft.com Message-Id: <4d59781c41cf_9d369067a94920f2@mx005.twitter.com.tmail> Subject: Confirm your Twitter account, lcwsoft! Mime-Version: 1.0 Content-Type: multipart/alternative; boundary=mimepart_4d59781c4879_9d369067a94921ab X-Campaignid: twitter20080313004044 Errors-To: Twitter <confirmation-fbpvny=ypjfbsg.pbz-255e9@postmaster.twitter.com> Bounces-To: Twitter <confirmation-fbpvny=ypjfbsg.pbz-255e9@postmaster.twitter.com> X-cPanel-MailScanner-Information: Please contact the ISP for more information X-cPanel-MailScanner-ID: 1Pp3Pi-0002z4-AI X-cPanel-MailScanner: Found to be clean X-cPanel-MailScanner-SpamCheck: not spam, SpamAssassin (not cached, score=2.419, required 5, BAYES_50 0.80, DKIM_SIGNED 0.10, DKIM_VALID -0.10, HTML_IMAGE_ONLY_24 1.62, HTML_MESSAGE 0.00, SPF_PASS -0.00, TVD_PH_SUBJ_ACCOUNTS_POST 0.00) X-cPanel-MailScanner-SpamScore: ss X-cPanel-MailScanner-From: confirmation-fbpvny=ypjfbsg.pbz-255e9@postmaster.twitter.com X-Spam-Status: No --mimepart_4d59781c4879_9d369067a94921ab Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: Quoted-printable Content-Disposition: inline Hi, lcwsoft. Please confirm your Twitter account by clicking this link: http://twitter.com/account/confirm_email/lcwsoft/98CB7-9HG9B-129770 Once you confirm, you will have full access to Twitter and all future not= ifications will be sent to this email address. - The Twitter Team If you received this message in error and did not sign up for a Twitter a= ccount, click on the url below: http://twitter.com/account/not_my_account/lcwsoft/98CB7-9HG9B-129770 Please do not reply to this message; it was sent from an unmonitored emai= l address. This message is a service email related to your use of Twitte= r. 
For general inquiries or to request support with your Twitter account= , please contact Twitter Support by visiting: http://support.twitter.com/= --mimepart_4d59781c4879_9d369067a94921ab Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: Quoted-printable Content-Disposition: inline <html> <body lang=3D"en" style=3D"background-color:#fff; color: #222"> <div style=3D"padding:14px; margin-bottom:4px; background-color:#008eb9= ; -moz-border-radius:5px;-webkit-border-radius:5px;border-radius:5px"> <a style=3D"color:#FFF" href=3D"http://twitter.com/?from=3Demailheade= r&utm_campaign=3Dtwitter20080313004044&utm_medium=3Demail&utm_source=3Dva= lidate_email"><img alt=3D"Twitter" height=3D"24" src=3D"http://a3.twimg.c= om/a/1297446951/images/twttr_bird_hd-008eb9.gif" style=3D"display:block;b= order: 0;" width=3D"130" /></a> </div> <div style=3D"font-family: 'Helvetica Neue', Arial, Helvetica, sans-ser= if; font-size:13px; margin: 14px; position:relative"> <h2 style=3D"font-family: 'Helvetica Neue', Arial, Helvetica, sans-se= rif;margin:0 0 16px; font-size:18px; font-weight:normal">Hi, lcwsoft.</h2= > <p> Please confirm your Twitter account by clicking this link:<br/> <a href=3D"http://twitter.com/account/confirm_email/lcwsoft/98CB7-9HG9B-1= 29770?utm_campaign=3Dtwitter20080313004044&utm_medium=3Demail&utm_source=3D= validate_email">http://twitter.com/account/confirm_email/lcwsoft/98CB7-9H= G9B-129770</a> </p> <p> Once you confirm, you will have full access to Twitter and all future n= otifications will be sent to this email address. 
</p> <p style=3D"font-family: 'Helvetica Neue', Arial, Helvetica, sans-serif;f= ont-size: 13px; line-height:18px;border-bottom: 1px solid rgb(238, 238, 2= 38); padding-bottom: 10px;"> <span style=3D"font: italic 13px Georgia,serif; color: rgb(102, 102, = 102);">The Twitter Team</span> = </p> <p style=3D"font-family: 'Helvetica Neue', Arial, Helvetica, sans-serif= ;margin-top:5px;font-size:10px;color:#888888;"> If you received this message in error and did not sign up for a Twitt= er account, click <a href=3D'http://twitter.com/account/not_my_account/lc= wsoft/98CB7-9HG9B-129770?utm_campaign=3Dtwitter20080313004044&utm_medium=3D= email&utm_source=3Dvalidate_email'>not my account</a>. </p> <p style=3D"font-family: 'Helvetica Neue', Arial, Helvetica, sans-serif;m= argin-top:5px;font-size:10px;color:#888888;"> = Please do not reply to this message; it was sent from an unmonitored em= ail address. This message is a service email related to your use of Twit= ter. For general inquiries or to request support with your Twitter accou= nt, please visit us at <a href=3D"http://support.twitter.com/">Twitter Su= pport</a>. </p> </div> </body> </html> --mimepart_4d59781c4879_9d369067a94921ab--
Correction: I mentioned the wrong rule above. It should be TVD_PH_SUBJ_ACCOUNTS_POST.
score TVD_PH_SUBJ_ACCOUNTS_POST 2.602 2.607 2.497 3.099
Why is the last one (for bayes + net) so much higher than the others? Every other rule seems to lower the score for bayes + net, since Bayes and the network tests add to the score as well. 3.099 is too much IMO for a system where the required spam score is 5.0.
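For anyone who wants the same local workaround in the meantime, here is a minimal sketch of a site-level override. The value 1.2 is only the figure suggested earlier in this bug, not a tested recommendation, and the file location (local.cf) is an assumption about the install.

```
# local.cf (site configuration); illustrative values only.
#
# Four values map to the four score sets, in this order:
#   1st: Bayes off, network tests off
#   2nd: Bayes off, network tests on
#   3rd: Bayes on,  network tests off
#   4th: Bayes on,  network tests on   (the "bayes + net" column)
score TVD_PH_SUBJ_ACCOUNTS_POST 1.2 1.2 1.2 1.2

# Equivalent shorthand: a single value is used for all four score sets.
# score TVD_PH_SUBJ_ACCOUNTS_POST 1.2
```

A local override like this only hides the FP on one site; it does not change what the rule matches or how masscheck scores it.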
(In reply to comment #0)
> TVD_PH_BODY_ACCOUNTS_{whichever} is hitting on Twitter confirmation e-mail. I don't
> have a suggestion for changes to the rule,
>
> X-DKIM: Sendmail DKIM Filter v2.8.2 mx005.twitter.com 05D71A86B8B
How about demoting the current version to a __subrule and meta-ing it with "not from twitter" || "not valid DKIM signature" for the scored version?
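A rough sketch of what that demotion could look like, under loud assumptions: the Subject pattern is a placeholder (the real rule's regex is not shown in this bug), and the LOCAL_* and __FROM_TWITTER names are hypothetical. It also assumes the DKIM plugin is loaded so that DKIM_VALID can fire.

```
# Hypothetical sketch, not the shipped rule.

# Unscored subrule (double-underscore prefix). Placeholder pattern only.
header   __LOCAL_PH_SUBJ_ACCOUNTS_POST  Subject =~ /\bconfirm your\b.{0,40}\baccount\b/i

# Hypothetical helper: From address is at twitter.com or a subdomain.
header   __FROM_TWITTER  From:addr =~ /\@(?:[a-z0-9-]+\.)*twitter\.com$/i

# Scored version: subrule hit AND ("not from twitter" OR "not valid DKIM").
meta     LOCAL_PH_SUBJ_ACCOUNTS_POST  (__LOCAL_PH_SUBJ_ACCOUNTS_POST && (!__FROM_TWITTER || !DKIM_VALID))
describe LOCAL_PH_SUBJ_ACCOUNTS_POST  Phishy account subject, not DKIM-valid mail from Twitter
score    LOCAL_PH_SUBJ_ACCOUNTS_POST  1.0
```

The exception only applies when both conditions hold, so mail that merely claims a twitter.com From address without a valid signature would still hit the scored rule.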
Wouldn't making a specific exception for Twitter only lead to cruft? How's about simply making those rules not fire if SPF_PASS or DKIM_VALID hit? I know this may lead to more spam not being hit by those rules, but it would kill the FPs for senders with properly configured SPF or DomainKeys.
(In reply to comment #4)
> Wouldn't making a specific exception for Twitter only lead to cruft?
That's certainly possible.
> How's
> about simply making those rules not fire if SPF_PASS or DKIM_VALID hit? I know
> this may lead to more spam not being hit by those rules, but it would kill the
> FPs for senders with properly configured SPF or DomainKeys.
That would probably be a good first step. Alternatively: !(From_Twitter && DKIM_VALID), but that may get into whack-a-mole territory for FP avoidance.
I'm only approaching this from the POV of a SA user, but I never understood why SA doesn't have a meta rule that does the following:
meta DNS_CHECK_PASSED (SPF_PASS || DKIM_VALID)
It could then be used in other rules, such as any rule that shouldn't trigger when a DNS check passes.
I agree that the specific exclusion for Twitter would inevitably lead to a whack-a-mole situation, with other exclusions being required in the future.
My solution is a bit more lenient, but would do the job. How many spam e-mails truly hit these rules and actually pass SPF or DKIM?
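To make the idea concrete, here is a minimal sketch of how such a helper might gate another rule. The __LOCAL_* and LOCAL_* names and the Subject pattern are hypothetical, and the helper is written as an unscored subrule (double-underscore prefix) so that it does not contribute a score of its own. It assumes the SPF and DKIM plugins are enabled.

```
# Hypothetical sketch; rule names and the Subject pattern are placeholders.

# Unscored helper: true when either sender-authentication check passed.
meta     __DNS_CHECK_PASSED  (SPF_PASS || DKIM_VALID)

# Placeholder for a phishy-subject test.
header   __LOCAL_PH_SUBJ     Subject =~ /\bconfirm your\b.{0,40}\baccount\b/i

# Only score the subject test when neither SPF nor DKIM passed.
meta     LOCAL_PH_SUBJ_GATED (__LOCAL_PH_SUBJ && !__DNS_CHECK_PASSED)
describe LOCAL_PH_SUBJ_GATED Phishy account subject without SPF pass or valid DKIM
score    LOCAL_PH_SUBJ_GATED 1.0
```

The trade-off is the one already raised in this bug: any spam that passes SPF or carries a valid DKIM signature would no longer hit the gated rule.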
This is still a problem. Twitter confirmation e-mails are hitting TVD_PH_SUBJ_ACCOUNTS_POST, which is now scored at 3.1, and are getting labeled as spam (they also hit PYZOR_CHECK at 1.98 for some reason). Thoughts?
(In reply to comment #6)
> I'm only approaching this from the POV of a SA user, but I never understood why
> SA doesn't have a meta rule that does the following
>
> meta DNS_CHECK_PASSED (SPF_PASS || DKIM_VALID)
>
> And then use that for other rules such as any rules that shouldn't trigger when
> a DNS check passes.
>
> I agree that the specific exclusion for Twitter would inevitably lead to a
> whack-a-mole situation with other exclusions being required in the future.
>
> My solution is a bit more lenient, but would do the job. How many spam e-mails
> truly hit these rules, and actually pass SPF or DKIM?
In general, this is the point of HAM rules that score negative. However, just because something has valid SPF or DKIM doesn't mean it isn't spam. In fact, I remember an old statistic that David Skoll with MIMEDefang mentioned: he saw this being adopted far quicker by the spammers ;-)
What we need to do to solve this problem is get more legit examples of the email into the masscheck, so that the system will score the rule lower because it hits on HAM by accident.
Looking at the overall rule, though, it is very prone to misfires and needs to be capped on scoring ASAP.
Theo, did you intend TVD_PH_SUBJ_META_ALL to be the only scored rule, or did you mean to let TVD_PH_SUBJ_ACCOUNTS_POST be scored?
If TVD_PH_SUBJ_ACCOUNTS_POST and the 4 related rules are changed to subrules (i.e. prefixed with __), this issue is solved. However, if they are meant to fire alone, we need to cap the scores because of FP concerns.
It looks like the TVD_PH_SUBJ_META_ALL meta fires if any of the subrules hit, so I believe you meant these to be subrules. But TVD_PH_SUBJ_META_ALL hasn't been promoted to active.
Anyway, I believe right now we are capping scores in masscheck to the scores in the sandbox. Can you add something like this to your sandbox file ASAP?
score TVD_PH_SUBJ_ACCOUNTS_POST 1.0
score TVD_PH_SUBJ_ACCOUNTS_PRE 0.1
score TVD_PH_SUBJ_SEC_MEASURES 0.1
score TVD_PH_SUBJ_UPDATE 0.1
score TVD_PH_SUBJ_META_ALL 1.0
Or change them to be subrules and add a max score you feel comfortable with to the meta rule?
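The subrule alternative might look roughly like the sketch below. This is not the contents of the sandbox file: the Subject patterns are placeholders, the LOCAL_* names are hypothetical, and only two of the individual tests are shown for brevity.

```
# Hypothetical sketch of the "subrules plus one capped meta" layout.

# Individual tests become unscored subrules (double-underscore prefix).
# Placeholder patterns; the real ones are not shown in this bug.
header   __LOCAL_PH_SUBJ_ACCOUNTS_POST  Subject =~ /\bconfirm your\b.{0,40}\baccount\b/i
header   __LOCAL_PH_SUBJ_ACCOUNTS_PRE   Subject =~ /\baccount\b.{0,40}\bconfirm/i

# One scored meta that fires if any subrule hits.
meta     LOCAL_PH_SUBJ_META_ALL  (__LOCAL_PH_SUBJ_ACCOUNTS_POST || __LOCAL_PH_SUBJ_ACCOUNTS_PRE)
describe LOCAL_PH_SUBJ_META_ALL  Subject matches at least one phishy account pattern
score    LOCAL_PH_SUBJ_META_ALL  1.0
```

With this layout a message that matches several subject patterns still collects the meta's score only once, which is one way to keep the total contribution bounded.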
(In reply to comment #8)
> In general, this is the point for HAM rules that score negative. However, just
> because something has valid SPF or DKIM, doesn't mean it isn't spam.
Yep.
> In fact, I remember an old statistic that David Skoll with MIMEDefang mentioned
> that he saw this being adopted far quicker by the spammers ;-)
Back in the 2.5 timeframe, we tried having a number of ham rules w/ good negative values. Sure enough, spammers adapted in order to hit those rules. We had to pull all of them out.
> What we need to do to solve this problem is get more legit examples of the
> email into the masscheck so that the system will score the rule lower because
> it hits on HAM by accident.
+1
> Looking at the overall rule, though, this rule is very prone to misfires and
> needs to be capped on scoring ASAP.
>
> Theo, did you intend META all to be the only rule or did you mean to let
> TVD_PH_SUBJ_ACCOUNTS_POST be scored?
I can't tell you what I was thinking at the time ... I wrote these rules a long time ago. :) My guess is that I was throwing things at the wall and seeing what would stick. :) As actual rules, versus subrules, the ones that did well would get promoted and the ones that didn't would just stay in the sandbox. The meta was probably an attempt to deal with FPs for some of the individual rules.
Looking at a recent STATISTICS* file, it looks like:
0.099 0.1420 0.0083 0.945 0.68 0.00 TVD_PH_SUBJ_ACCOUNTS_POST
So overall, that's pretty good. Can those ham hits be checked out? If they're FPs, can the rule be modified to avoid them w/out significantly dropping the spam hit rate?
> Can you add something like this to your sandbox file ASAP?
[...]
> Or change to be subrules and add a max score you feel comfortable with to the
> meta rule?
I'm not really in a position to do rule development at this point, so I would say go ahead and make any changes that you feel are necessary.
Here is another Twitter e-mail that got flagged as spam and hit the rule mentioned:
Return-path: <confirmation-ynjeraprjvyyvnzf=ypjfbsg.pbz-53492@postmaster.twitter.com> Envelope-to: lawrencewilliams@lcwsoft.com Delivery-date: Thu, 30 Jun 2011 20:16:32 -0230 Received: from ham-cannon.twitter.com ([199.59.148.237]) by athena.lcwsoft.com with esmtp (Exim 4.69) (envelope-from <confirmation-ynjeraprjvyyvnzf=ypjfbsg.pbz-53492@postmaster.twitter.com>) id 1QcQ02-004J1W-VJ for lawrencewilliams@lcwsoft.com; Thu, 30 Jun 2011 20:16:19 -0230 X-DKIM: Sendmail DKIM Filter v2.8.2 4090084717.twitter.com AD87F3A70A1C DKIM-Signature: v=1; a=rsa-sha1; c=simple/simple; d=twitter.com; s=dkim; t=1309473976; i=@twitter.com; bh=Go4LXa7P0uFIvmK6mYlwhvqFII4=; h=Date:From:Reply-To:To:Subject:Mime-Version:Content-Type; b=eyysdCp0zf6itWK3okwloO0ecdNw8YEL6wwLHwDGemcHSEqzWY7Wd5SkKsazCEANI uYL7b3XXKSVp0RNoku5DsjJWJOhkr9PyfxH5SUPnZJk/60xG82xBB7v7yuayq4KgK/ t3lBCjIJLrCj4JNGzci2FLevZVCobxf2oVKOV4xE= X-DomainKeys: Sendmail DomainKeys Filter v1.0.2 4090084717.twitter.com AD87F3A70A1C DomainKey-Signature: a=rsa-sha1; s=default; d=twitter.com; c=simple; q=dns; h=date:from:reply-to:to:subject:mime-version:content-type: x-campaignid:errors-to:bounces-to; b=PBsWvnAQHFB8LVLcjXNzYu5KEntmQhU0kC2mHB1VbG9IQCFifa2gNXvLNO8bAUECv qp8p9Sd45ktrTi8sGAF/w== Date: Thu, 30 Jun 2011 22:46:16 +0000 From: Twitter <confirmation-ynjeraprjvyyvnzf=ypjfbsg.pbz-53492@postmaster.twitter.com> Reply-To: noreply@postmaster.twitter.com To: lawrencewilliams@lcwsoft.com Message-Id: <4e0cfcb8ac870_8fe43c4ea988258e2@4090084717.twitter.com.tmail> Subject: {Spam?} Confirm your Twitter account, lcwsoft!
Mime-Version: 1.0 Content-Type: multipart/alternative; boundary=mimepart_4e0cfcb8acddb_8fe43c4ea988259d3 X-Campaignid: twitter20080313004044 Errors-To: Twitter <confirmation-ynjeraprjvyyvnzf=ypjfbsg.pbz-53492@postmaster.twitter.com> Bounces-To: Twitter <confirmation-ynjeraprjvyyvnzf=ypjfbsg.pbz-53492@postmaster.twitter.com> X-LCWSoft-MailScanner-Information: Please contact the ISP for more information X-LCWSoft-MailScanner-ID: 1QcQ02-004J1W-VJ X-LCWSoft-MailScanner: Found to be clean X-LCWSoft-MailScanner-SpamCheck: spam, SpamAssassin (not cached, score=5.874, required 5, DKIM_SIGNED 0.10, DKIM_VALID -0.10, HTML_IMAGE_ONLY_24 1.28, HTML_MESSAGE 0.00, PYZOR_CHECK 1.98, SPF_PASS -0.00, TVD_PH_SUBJ_ACCOUNTS_POST 2.61) X-LCWSoft-MailScanner-SpamScore: sssss X-LCWSoft-MailScanner-From: confirmation-ynjeraprjvyyvnzf=ypjfbsg.pbz-53492@postmaster.twitter.com --mimepart_4e0cfcb8acddb_8fe43c4ea988259d3 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: Quoted-printable Content-Disposition: inline Hi, lcwsoft. Please confirm your Twitter account by clicking this link: http://twitter.com/account/confirm_email/lcwsoft/HD847-97AG3-130947 Once you confirm, you will have full access to Twitter and all future not= ifications will be sent to this email address. - The Twitter Team If you received this message in error and did not sign up for a Twitter a= ccount, click on the url below: http://twitter.com/account/not_my_account/lcwsoft/HD847-97AG3-130947 Please do not reply to this message; it was sent from an unmonitored emai= l address. This message is a service email related to your use of Twitte= r. 
For general inquiries or to request support with your Twitter account= , please contact Twitter Support by visiting: http://support.twitter.com/= --mimepart_4e0cfcb8acddb_8fe43c4ea988259d3 Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: Quoted-printable Content-Disposition: inline <html> <body lang=3D"en" style=3D"background-color:#fff; color: #222"> <div style=3D"padding:14px; margin-bottom:4px; background-color:#008eb9= ; -moz-border-radius:5px;-webkit-border-radius:5px;border-radius:5px"> <a style=3D"color:#FFF" href=3D"http://twitter.com/?from=3Demailheade= r&utm_campaign=3Dtwitter20080313004044&utm_medium=3Demail&utm_source=3Dva= lidate_email"><img alt=3D"Twitter" height=3D"24" src=3D"http://a1.twimg.c= om/a/1309465578/images/twttr_bird_hd-008eb9.gif" style=3D"display:block;b= order: 0;" width=3D"130" /></a> </div> <div style=3D"font-family: 'Helvetica Neue', Arial, Helvetica, sans-ser= if; font-size:13px; margin: 14px; position:relative"> <h2 style=3D"font-family: 'Helvetica Neue', Arial, Helvetica, sans-se= rif;margin:0 0 16px; font-size:18px; font-weight:normal">Hi, lcwsoft.</h2= > <p> Please confirm your Twitter account by clicking this link:<br/> <a href=3D"http://twitter.com/account/confirm_email/lcwsoft/HD847-97AG3-1= 30947?utm_campaign=3Dtwitter20080313004044&utm_medium=3Demail&utm_source=3D= validate_email">http://twitter.com/account/confirm_email/lcwsoft/HD847-97= AG3-130947</a> </p> <p> Once you confirm, you will have full access to Twitter and all future n= otifications will be sent to this email address. 
</p> <p style=3D"font-family: 'Helvetica Neue', Arial, Helvetica, sans-serif;f= ont-size: 13px; line-height:18px;border-bottom: 1px solid rgb(238, 238, 2= 38); padding-bottom: 10px; margin: 0 0 10px;"> <span style=3D"font: italic 13px Georgia,serif; color: rgb(102, 102, = 102);">The Twitter Team</span> = </p> <p style=3D"font-family: 'Helvetica Neue', Arial, Helvetica, sans-serif= ;margin-top:5px;font-size:10px;color:#888888;"> If you received this message in error and did not sign up for a Twitt= er account, click <a href=3D'http://twitter.com/account/not_my_account/lc= wsoft/HD847-97AG3-130947?utm_campaign=3Dtwitter20080313004044&utm_medium=3D= email&utm_source=3Dvalidate_email'>not my account</a>. </p> <p style=3D"font-family: 'Helvetica Neue', Arial, Helvetica, sans-serif= ;margin-top:5px;font-size:10px;color:#888888;"> = Please do not reply to this message; it was sent from an unmonitored = email address. This message is a service email related to your use of Tw= itter. For general inquiries or to request support with your Twitter acc= ount, please visit us at <a href=3D"http://support.twitter.com/">Twitter = Support</a>. </p> </div> </body> </html> --mimepart_4e0cfcb8acddb_8fe43c4ea988259d3--
Created attachment 4930: false positive on TVD_PH_SUBJ_ACCOUNTS_POST
From my masscheck corpus, a false positive on TVD_PH_SUBJ_ACCOUNTS_POST.
I have another FP in my masscheck corpus from one of my credit card companies. I'd rather not attach the whole thing to an open bug.
(In reply to comment #12)
> I have another FP in my masscheck corpus from one of my credit card companies.
> I'd rather not attach the whole thing to an open bug.
That's ok. I feel comfortable enough that FPs exist that we need to tweak the rule. Just looking at the rule, it's going to fire on a lot of things. I don't have a problem with that, but the score should be in line with that reality.
My theory is that masscheck is scoring it too high because there isn't enough ham corpora to push things the other way.
(In reply to comment #13)
> (In reply to comment #12)
> > I have another FP in my masscheck corpus from one of my credit card companies.
> > I'd rather not attach the whole thing to an open bug.
>
> That's ok. I feel comfortable enough that FPs exist that we need to tweak the
> rule. Just looking at the rule, it's going to fire on a lot of things. I
> don't have a problem with that but the score should be in line with that
> reality.
>
> My theory is Masscheck is scoring it too high because there isn't enough ham
> corpora to change things the otherway.
Theo, I will worry about improving the spam hit rate after we get the FPs identified and under control. Apologies if I take the rules in the wrong direction!
svn commit -m 'Tweaks on Phishy Subject Rules from Theo to try and fix False Positives discussed in bug 6579'
Sending felicity/70_phishing.cf
Transmitting file data .
Committed revision 1142093.
I will check and see how things look after the holiday weekend.
Regards, KAM
No other replies in 6 years; time to close the bz?
The rule is no longer extant in the source sandbox or the core rules.