[JAMES-1216] [gsoc2011] Design and implement machine learning filters and categorization for mail - ASF JIRA

Details

Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels:
- gsoc2011

Description

Context: James provides anti-spam functionality based on SpamAssassin, implemented as mailets (http://james.apache.org/mailet). Bayesian mailets are also available, but they are not completely integrated or documented. Nothing is available to automatically categorize mail traffic per user.

Task: We want to align the existing implementation with a modern anti-spam solution based on a powerful machine learning implementation (such as Apache Mahout). We also want to extend the use of machine learning to mail categorization: spam vs. not-spam is the first category, and it can be extended to any additional category. The implementation can run partly while the mails are being spooled and/or once the mail is stored in the mailbox.
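
To make the categorization idea concrete, here is a minimal sketch of a multi-category text classifier (multinomial naive Bayes with add-one smoothing) in plain Java. It is only an illustration under stated assumptions: the class name MailCategorizer, its train and classify methods, and the category labels are hypothetical, and it uses neither the James Mailet API nor Apache Mahout. A real implementation would presumably wrap something like this in a mailet or a mailbox listener.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: multinomial naive Bayes over mail text (not James or Mahout code). */
final class MailCategorizer {
    // Per-category token frequencies, e.g. "spam" -> {"pills" -> 3}.
    private final Map<String, Map<String, Integer>> tokenCounts = new HashMap<>();
    // Total number of tokens seen per category.
    private final Map<String, Integer> totalTokens = new HashMap<>();
    // Number of training mails per category, used for the prior.
    private final Map<String, Integer> docCounts = new HashMap<>();
    // All distinct tokens seen during training, used for smoothing.
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    // Naive tokenizer (an assumption): lowercase, split on non-alphanumerics.
    private static String[] tokenize(String text) {
        return text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+");
    }

    /** Records one training mail under the given category. */
    void train(String category, String text) {
        Map<String, Integer> counts =
                tokenCounts.computeIfAbsent(category, k -> new HashMap<>());
        for (String token : tokenize(text)) {
            if (token.isEmpty()) {
                continue;
            }
            counts.merge(token, 1, Integer::sum);
            totalTokens.merge(category, 1, Integer::sum);
            vocabulary.add(token);
        }
        docCounts.merge(category, 1, Integer::sum);
        totalDocs++;
    }

    /** Returns the most probable category, or null if nothing was trained. */
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String category : docCounts.keySet()) {
            // Log prior: share of training mails in this category.
            double score = Math.log(docCounts.get(category) / (double) totalDocs);
            Map<String, Integer> counts = tokenCounts.get(category);
            int total = totalTokens.getOrDefault(category, 0);
            for (String token : tokenize(text)) {
                if (token.isEmpty()) {
                    continue;
                }
                int count = counts.getOrDefault(token, 0);
                // Add-one (Laplace) smoothing so unseen tokens do not zero the score.
                score += Math.log((count + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = category;
            }
        }
        return best;
    }
}
```

With a few training mails per category (for example "spam", "work", "newsletter"), classify returns the label whose training text best matches the new mail. The same per-user model could in principle be consulted either during spooling or after delivery to the mailbox, as the task describes.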

Related discussions: See also the discussions on intelligent mail mining at http://markmail.org/message/2bodrwvdvtfq3f2v (Mahout related) and http://markmail.org/thread/pksl6csyvoeo27yh (Hama related).

Mentor: eric at apache dot org & [fill in mentor]

Complexity: high

Sub-Tasks

- AI Mailets Website (In Progress, Robert Burrell Donkin)

People

Assignee: Eric Charles
Reporter: Eric Charles
Votes: 0
Watchers: 1

Dates

Created: 30/Mar/11 06:24
Updated: 26/Apr/11 01:51