Details
Type: Bug
Status: Closed
Priority: Minor
Resolution: Fixed
3.0.0
None
Environment: James from svn-trunk 2005-08-01, MySQL 4.0
Description
Got this exception for every incoming mail:
02/08/05 00:39:25 INFO James.Mailet: BayesianAnalysis: Exception: java.lang.Integer
java.lang.ClassCastException: java.lang.Integer
    at org.apache.james.util.BayesianAnalyzer.getTokenProbabilityStrengths(BayesianAnalyzer.java:591)
    at org.apache.james.util.BayesianAnalyzer.computeSpamProbability(BayesianAnalyzer.java:340)
    at org.apache.james.transport.mailets.BayesianAnalysis.service(BayesianAnalysis.java:289)
    at org.apache.james.transport.LinearProcessor.service(LinearProcessor.java:407)
    at org.apache.james.transport.JamesSpoolManager.process(JamesSpoolManager.java:460)
    at org.apache.james.transport.JamesSpoolManager.run(JamesSpoolManager.java:369)
    at java.lang.Thread.run(Unknown Source)
If I clean my spam/ham db the exceptions disappear, but they start again once the spam/ham db becomes large.
My bayesiananalysis_spam contains 200000 rows.
The following are the spam tokens with the highest "occurrences" values.
+---------------------------+-------------+
| token                     | occurrences |
+---------------------------+-------------+
| 3D                        |       82151 |
| a                         |       59953 |
| the                       |       45295 |
| FONT                      |       42771 |
| Content-Type              |       39058 |
| to                        |       36626 |
| com                       |       32902 |
| http                      |       32886 |
| of                        |       32504 |
| font                      |       31803 |
| and                       |       31577 |
| Content-Transfer-Encoding |       31576 |
| p                         |       29746 |
| text                      |       29482 |
| in                        |       29418 |
| it                        |       28498 |
| br                        |       28037 |
| DIV                       |       27431 |
+---------------------------+-------------+
I took a careful look at the code and couldn't find anything wrong. I have a spam table with more than 258000 rows and everything works fine for me.
IMHO a possible explanation of Stefano's exceptions is the following:
The ham/spam corpus hashmaps may take a lot of memory. Accordingly, I gave the JVM a large -Xmx heap.
I remember that some time ago, in a Java (non-James) application, I saw unpredictable JVM behaviour (strange exceptions being thrown) when the available heap was only just enough for what was needed. Decreasing the -Xmx size slightly gave me an OutOfMemoryError, and increasing it made everything work fine.
Stefano, can you try with more memory?
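To see how much heap the JVM actually has before and after changing -Xmx, something like the following minimal sketch might help. The class name HeapReport and its helper methods are hypothetical and not part of James; it relies only on the standard java.lang.Runtime methods maxMemory(), totalMemory() and freeMemory().

```java
// Hypothetical helper, not part of James: reports the JVM heap figures.
final class HeapReport {

    /** Converts a byte count to whole mebibytes (rounds down). */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Builds a one-line summary of the current heap figures. */
    static String describeHeap() {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();            // upper limit the JVM will try to use (governed by -Xmx)
        long total = rt.totalMemory();        // heap currently reserved by the JVM
        long used = total - rt.freeMemory();  // part of the reserved heap that is in use
        return "max=" + toMiB(max) + "MiB"
                + " total=" + toMiB(total) + "MiB"
                + " used=" + toMiB(used) + "MiB";
    }

    public static void main(String[] args) {
        System.out.println(describeHeap());
    }
}
```

Running it with the same -Xmx value as James (for example java -Xmx512m HeapReport, where 512m is only an illustrative figure) shows the limit the JVM really applies. If "used" sits close to "max" once the corpus is loaded, the heap is probably too tight.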