James Mailbox / MAILBOX-44

[gsoc2011] Design and implement a distributed mailbox using Hadoop


    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4
    • Component/s: None
    • Labels:


      Context: The mailbox subproject (http://james.apache.org/mailbox/) supports maildir, SQL databases (via JPA), and Java Content Repository (JCR) as mail storage technologies. This flexibility comes from an API design that abstracts mail storage from the mail protocols.

      Task: Implement mailbox storage as a distributed system on top of Hadoop HDFS, using the James mailbox API. The first step is to design how to interact with Hadoop (native API, Gora from the Apache Incubator, ...) and to address performance questions specific to loading and parsing mail in a distributed system (whether to use map/reduce, whether to reuse existing local Lucene indexes for search, ...). The second step is to implement the HDFS mailbox; the maildir mailbox is similar because it stores each mail as a file and can serve as inspiration. A single James server will still be deployed because there is no distributed UID generation.
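
      The file-per-mail layout and the single-server UID constraint described in the task can be sketched roughly as follows. This is only an illustration under stated assumptions: the class FileMailboxSketch, its methods, and the ".eml" file naming are hypothetical and are not part of the James mailbox API, and java.nio.file on a local directory stands in for HDFS. The in-memory counter hands out unique UIDs only within one JVM, which is why such a design would still need a single server.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: NOT the James mailbox API and NOT an HDFS client.
final class FileMailboxSketch {
    // Local directory used as a stand-in for an HDFS directory.
    private final Path root;
    // UIDs are unique only within this one JVM; the counter is not persisted
    // across restarts and is not coordinated between servers.
    private final AtomicLong nextUid = new AtomicLong(1);

    FileMailboxSketch(Path root) throws IOException {
        this.root = Files.createDirectories(root);
    }

    // Stores one message as one file named after its UID and returns that UID.
    long append(String rawMessage) throws IOException {
        long uid = nextUid.getAndIncrement();
        Files.write(root.resolve(uid + ".eml"),
                rawMessage.getBytes(StandardCharsets.UTF_8));
        return uid;
    }

    // Reads a stored message back by its UID.
    String read(long uid) throws IOException {
        return new String(Files.readAllBytes(root.resolve(uid + ".eml")),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("mailbox").resolve("INBOX");
        FileMailboxSketch inbox = new FileMailboxSketch(dir);
        long uid = inbox.append("Subject: hello\r\n\r\nbody");
        System.out.println("stored uid " + uid + " under " + dir);
        System.out.println(inbox.read(uid));
    }
}
```

      Two servers running this sketch against the same directory would both start counting at 1 and overwrite each other's files, which illustrates why distributed UID generation would be needed before deploying more than one server.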

      Mentor: eric at apache dot org

      Complexity: medium


        Eric Charles created issue -
        Norman Maurer made changes -
          Fix Version/s: (none) -> 0.3 [ 12316446 ]
        Norman Maurer made changes -
          Fix Version/s: 0.3 [ 12316446 ] -> 0.4 [ 12316646 ]
        Eric Charles made changes -
          Status: Open [ 1 ] -> Closed [ 6 ]
          Resolution: (none) -> Fixed [ 1 ]


          • Assignee:
            Norman Maurer
          • Reporter:
            Eric Charles
          • Votes:
            0
          • Watchers:
            5


            • Created: