Bug 6659 - Empty Content-Type causes learning of binary file
Summary: Empty Content-Type causes learning of binary file
Status: RESOLVED WORKSFORME
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Learner
Version: 3.3.1
Hardware: PC Linux
Importance: P2 normal
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-09-16 14:51 UTC by Joolee
Modified: 2019-06-24 12:44 UTC
CC List: 2 users



Attachment Type Modified Status Actions Submitter/CLA Status
One of the messages that "fail" !WARNING! The attachment is probably a virus! text/plain None Joolee [NoCLA]
Demo mail 2 text/plain None Joolee [NoCLA]

Description Joolee 2011-09-16 14:51:50 UTC
I'm receiving a few hundred mails a day with small attachments. When a message is checked for spam, the attachments are (afaik) parsed correctly and nothing is done with them. When the e-mails are autolearned as spam, however, the attachments are decoded and parsed by the Bayes algorithm. The only strange thing I can find in the message (apart from the text content obviously being a phishing mail) is the header of the attachment part:

------=_NextPart_000_0006_01CC51AC.63F30F00
Content-Type: ;
        name="report_1609.pdf.zip"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
        filename="report_1609.pdf.zip"

I suspect the empty content-type causes the attachment to be decoded.
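
As a rough illustration of the suspicion above, the following sketch flags MIME parts whose Content-Type header has an empty media type, like the "Content-Type: ;" header quoted earlier. It uses Python's standard `email` package rather than SpamAssassin's own Perl parser, so it says nothing definite about how SpamAssassin handles such a part. The sample message, the `BOUNDARY` string, the addresses, and the function name `parts_with_empty_media_type` are all made up for the example.

```python
import email

# Hypothetical sample message; only the second part mimics the report.
SAMPLE = """\
From: sender@example.com
To: rcpt@example.com
Subject: demo
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="BOUNDARY"

--BOUNDARY
Content-Type: text/plain

Body text.
--BOUNDARY
Content-Type: ;
\tname="report.zip"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
\tfilename="report.zip"

UEsDBAoAAAAAAA==
--BOUNDARY--
"""


def parts_with_empty_media_type(raw_message):
    """Return (filename, type reported by Python) for each part whose
    Content-Type header is present but names no media type."""
    msg = email.message_from_string(raw_message)
    hits = []
    for part in msg.walk():
        raw = part.get("Content-Type")
        if raw is None:
            continue  # header absent altogether: not the case from the report
        # Everything before the first ';' is the media type.
        media_type = str(raw).split(";", 1)[0].strip()
        if media_type == "":
            hits.append((part.get_filename(), part.get_content_type()))
    return hits


print(parts_with_empty_media_type(SAMPLE))
```

Python's parser reports such a part as text/plain, the fallback that RFC 2045 section 5.2 recommends for a syntactically invalid Content-Type. A parser applying the same fallback would hand the decoded attachment to text processing, which would fit the behaviour described here, but that is an assumption and has not been verified against SpamAssassin 3.3.1.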

Running spamassassin in debug mode shows it hanging between the following two lines:
Sep 16 15:07:12.279 [8264] dbg: bayes: Using userid: 1
Sep 16 15:08:48.746 [8264] dbg: bayes: seen (bf76e190b8121487c91051758a402dd20b18eaa6@sa_generated) put

Manually calling sa-learn hangs for a while at the "decoding base64" part:
Sep 16 15:34:12.786 [18308] dbg: message: decoding base64
Forgot tokens from 1 message(s) (1 message(s) examined)
Sep 16 15:35:49.764 [18308] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x3891ba0) implements 'learner_close', priority 0
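
The length of the stall can be read off the debug timestamps. A minimal sketch, assuming the "Mon DD HH:MM:SS.mmm" prefix seen in the log lines above, an English month abbreviation, and the year 2011 taken from the report date (the log lines themselves carry no year). `stall_seconds` is a hypothetical helper, not part of SpamAssassin.

```python
from datetime import datetime


def stall_seconds(before, after, year=2011):
    """Return the seconds elapsed between two debug log lines."""
    def stamp(line):
        # The first three whitespace-separated fields are month, day, and time.
        month, day, clock = line.split()[:3]
        return datetime.strptime(
            f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S.%f"
        )
    return (stamp(after) - stamp(before)).total_seconds()


# The two spamassassin debug lines quoted in the report.
before = "Sep 16 15:07:12.279 [8264] dbg: bayes: Using userid: 1"
after = ("Sep 16 15:08:48.746 [8264] dbg: bayes: seen "
         "(bf76e190b8121487c91051758a402dd20b18eaa6@sa_generated) put")

print(stall_seconds(before, after))
```

For the two spamassassin lines this gives about 96.5 seconds; the same calculation on the sa-learn pair gives about 97 seconds.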
Comment 1 Joolee 2011-09-16 14:55:59 UTC
Created attachment 4959
One of the messages that "fail" !WARNING! The attachment is probably a virus!
Comment 2 Joolee 2011-10-07 07:48:34 UTC
Created attachment 4978
Demo mail 2

!Warning! Attachment probably a virus!
Comment 3 Joolee 2011-10-07 07:50:51 UTC
It seems my theory was correct. I've received a set of e-mails with other characteristics that produce the same result (extremely long sa-learn times).

These e-mails also contain an attachment with an empty Content-Type header:

------=_NextPart_000_0006_01CC519D.0905EB80
Content-Type: ;
        name="Uniform traffic ticket.zip"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
        filename="Uniform traffic ticket.zip"
Comment 4 Henrik Krohns 2019-06-24 12:44:23 UTC
Doesn't seem to be a problem in current version, closing.