James Server / JAMES-1216

[gsoc2011] Design and implement machine learning filters and categorization for mail


    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:


      Context: Anti-spam functionality based on SpamAssassin is available in James (implemented as mailets, see http://james.apache.org/mailet). Bayesian mailets are also available, but they are not fully integrated or documented. Nothing is available to automatically categorize mail traffic per user.

      Task: We want to align the existing implementation with a modern anti-spam solution built on a powerful machine learning library (such as Apache Mahout). We also want to extend the use of machine learning to general mail categorization: spam vs. not-spam is the first categorization, and it can be extended to any additional category we can imagine. Classification can take place while mail is being spooled, when mail is stored in the mailbox, or both.
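
      To make the categorization idea concrete, here is a minimal sketch of a multi-category text classifier. It assumes a plain multinomial naive Bayes model with Laplace smoothing written against the JDK only; it does not use Mahout, SpamAssassin, or the Mailet API. The names `MailCategorizer`, `train`, and `classify`, and the sample categories, are hypothetical and exist only for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only: not part of James, the Mailet API, or Mahout. */
class MailCategorizer {
    // category -> (token -> occurrences in that category's training text)
    private final Map<String, Map<String, Integer>> tokenCounts = new HashMap<>();
    // category -> total number of tokens seen for that category
    private final Map<String, Integer> totalTokens = new HashMap<>();
    // category -> number of training messages
    private final Map<String, Integer> messageCounts = new HashMap<>();
    // all distinct tokens seen during training
    private final Set<String> vocabulary = new HashSet<>();
    private int totalMessages = 0;

    // Very naive tokenizer (an assumption): lowercase, split on non-alphanumerics.
    private static String[] tokenize(String text) {
        return text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+");
    }

    /** Adds one labelled message to the model. */
    void train(String category, String text) {
        Map<String, Integer> counts =
                tokenCounts.computeIfAbsent(category, k -> new HashMap<>());
        for (String token : tokenize(text)) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            counts.merge(token, 1, Integer::sum);
            totalTokens.merge(category, 1, Integer::sum);
            vocabulary.add(token);
        }
        messageCounts.merge(category, 1, Integer::sum);
        totalMessages++;
    }

    /** Returns the most probable category, or null if nothing was trained. */
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String category : messageCounts.keySet()) {
            // log prior: share of training messages in this category
            double score = Math.log(messageCounts.get(category) / (double) totalMessages);
            Map<String, Integer> counts = tokenCounts.get(category);
            int total = totalTokens.getOrDefault(category, 0);
            for (String token : tokenize(text)) {
                if (token.isEmpty()) {
                    continue;
                }
                // log likelihood with add-one (Laplace) smoothing
                int count = counts.getOrDefault(token, 0);
                score += Math.log((count + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = category;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        MailCategorizer categorizer = new MailCategorizer();
        categorizer.train("spam", "Win a free prize now");
        categorizer.train("spam", "Cheap pills, free offer");
        categorizer.train("work", "Meeting agenda for the project review");
        categorizer.train("work", "Please review the project report");
        categorizer.train("newsletter", "Weekly digest of mailing list updates");
        categorizer.train("newsletter", "Monthly digest and release updates");

        System.out.println(categorizer.classify("free prize offer"));
        System.out.println(categorizer.classify("project review meeting"));
    }
}
```

      In a real integration, a classifier of this kind would presumably be called from a mailet during spooling or from a mailbox listener after delivery, with one trained model per user; those integration points are left open here, as they are in the task description.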

      Related discussions: See also the discussions on intelligent mail mining at http://markmail.org/message/2bodrwvdvtfq3f2v (Mahout-related) and http://markmail.org/thread/pksl6csyvoeo27yh (Hama-related).

      Mentor: eric at apache dot org & [fill in mentor]

      Complexity: high



          • Assignee:
            Eric Charles
          • Votes:
            0
          • Watchers:
            1


            • Created: