James Mailbox / MAILBOX-72

Requirements for a distributed mailbox implementation

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      This issue collects generic technical requirements for a distributed mailbox implementation, whatever the implementation technology.
      The implementer is responsible for enforcing those requirements.

          Activity

          Ioan Eugen Stan added a comment -

          This is a summary of a discussion on the server-dev@james.apache.org mailing list.

          For best results when implementing mailbox storage over a distributed environment (namely HBase/HDFS), the following entities, and the operations each one supports, must be taken into consideration:

          • mailbox (immutable: create/read/delete/query)
          • message (immutable: create/read/delete/query)
          • message flags (create/read/update/delete/query)
          • subscriptions (create/read/update/delete/query)

          Important things regarding HBase:

          • cells are versioned
          • rows are sorted by row key - very important
          • the columns of a column family are physically stored together, so they should share the same access pattern (read-only, or read/write)
          • all column families must be created with the table
          • columns may be added on the fly to column families.
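
          The points above can be illustrated with a small in-memory model. This is only a sketch: TinyTable and its methods are hypothetical names invented for illustration, it is not the HBase client API, and it ignores regions, storage and deletes. It shows column families fixed at creation, columns added on the fly, versioned cells, and rows kept sorted by key.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

/** In-memory stand-in for one table; a teaching model, not the HBase client API. */
final class TinyTable {
    // Column families are fixed when the table is created.
    private final Set<String> families;

    // row key (sorted) -> "family:qualifier" -> timestamp (newest first) -> value
    private final TreeMap<String, TreeMap<String, TreeMap<Long, String>>> rows = new TreeMap<>();

    TinyTable(String... families) {
        this.families = Set.of(families);
    }

    /** Writes one cell version; the qualifier may be new, the family may not. */
    void put(String row, String family, String qualifier, long timestamp, String value) {
        if (!families.contains(family)) {
            throw new IllegalArgumentException("unknown column family: " + family);
        }
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family + ":" + qualifier,
                             c -> new TreeMap<Long, String>(Comparator.reverseOrder()))
            .put(timestamp, value);
    }

    /** Returns the newest version of a cell, or null if the cell does not exist. */
    String getLatest(String row, String family, String qualifier) {
        TreeMap<String, TreeMap<Long, String>> columns = rows.get(row);
        if (columns == null) {
            return null;
        }
        TreeMap<Long, String> versions = columns.get(family + ":" + qualifier);
        return versions == null ? null : versions.firstEntry().getValue();
    }

    /** Row keys in sorted order, regardless of insertion order. */
    List<String> rowKeys() {
        return new ArrayList<>(rows.keySet());
    }
}
```

          Writing the "seen" flag twice for the same row keeps both versions and getLatest returns the newer one, which is roughly how a flag update could look on top of versioned cells.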

          Some useful tips for choosing keys and column names:

          • you can use the reversed domain name to keep keys in a useful sorted order (like org.apache@username)
          • you can use a reverse-order timestamp (Long.MAX_VALUE - epoch) to keep the newest records first (get the latest emails first)
          • use the binary encoding instead of the string representation when a key is an integer value.
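
          A minimal sketch of the three tips, assuming keys are compared byte by byte as unsigned values. RowKeys and its method names are hypothetical helpers made up for this example, and the key layout in messageKey (user prefix followed by the reversed timestamp) is just one possible choice.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helpers illustrating the key-design tips; not part of James or HBase. */
final class RowKeys {
    private RowKeys() {}

    /** "username@apache.org" becomes "org.apache@username", so one domain sorts together. */
    static String reverseDomain(String address) {
        int at = address.lastIndexOf('@');
        if (at < 0) {
            throw new IllegalArgumentException("not an address: " + address);
        }
        String user = address.substring(0, at);
        String[] labels = address.substring(at + 1).split("\\.");
        StringBuilder key = new StringBuilder();
        for (int i = labels.length - 1; i >= 0; i--) {
            key.append(labels[i]);
            if (i > 0) {
                key.append('.');
            }
        }
        return key.append('@').append(user).toString();
    }

    /** Newer timestamps map to smaller values, so they sort first. */
    static long reverseTimestamp(long epochMillis) {
        return Long.MAX_VALUE - epochMillis;
    }

    /** Fixed-width big-endian bytes: byte order matches numeric order for non-negative values. */
    static byte[] toBytes(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    /** Assumed layout: reversed-domain user prefix, then the 8-byte reversed timestamp. */
    static byte[] messageKey(String address, long epochMillis) {
        byte[] prefix = reverseDomain(address).getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(prefix.length + Long.BYTES)
                .put(prefix)
                .put(toBytes(reverseTimestamp(epochMillis)))
                .array();
    }
}
```

          The binary encoding matters because strings compare character by character: "10" sorts before "9" as text, while toBytes(9) sorts before toBytes(10).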

          A paper detailing a sample data schema for Cassandra is available in [1].

          Further reading on big-data column stores:

          [1] http://ewh.ieee.org/r6/scv/computer/nfic/2009/IBM-Jun-Rao.pdf
          [2] Hadoop: The Definitive Guide, second edition, the HBase chapter.
          [3] http://db.csail.mit.edu/projects/cstore/abadicidr07.pdf
          [4] http://en.wikipedia.org/wiki/Column-oriented_DBMS

          Eric Charles added a comment -

          The attached image shows the domain classes that the store must implement:

          • Mailbox (with id, namespace, user, name, uidValidity)
          • Subscription (with mailbox, user)
          • Message (with date, mailboxid, uid, flags, fullcontent, bodycontent, mediatype, headers, properties,...)
          • Header (with fieldname, linenumber, value)
          • Property (with namespace, localname, value)
          Eric Charles added a comment -

          To design a data model, it is important to know which queries we will have to support.
          Below are the queries taken from mailbox-jpa (SQL database).

          From AbstractJPAMessage:
          @NamedQueries({
          @NamedQuery(name="findRecentMessagesInMailbox",
          query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.recent = TRUE"),
          @NamedQuery(name="findUnseenMessagesInMailboxOrderByUid",
          query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.seen = FALSE ORDER BY message.uid ASC"),
          @NamedQuery(name="findMessagesInMailbox",
          query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam"),
          @NamedQuery(name="findMessagesInMailboxBetweenUIDs",
          query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid BETWEEN :fromParam AND :toParam"),
          @NamedQuery(name="findMessagesInMailboxWithUID",
          query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid=:uidParam"),
          @NamedQuery(name="findMessagesInMailboxAfterUID",
          query="SELECT message FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid>=:uidParam"),
          @NamedQuery(name="findDeletedMessagesInMailbox",
          query="SELECT message.uid FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.deleted=TRUE"),
          @NamedQuery(name="findDeletedMessagesInMailboxBetweenUIDs",
          query="SELECT message.uid FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid BETWEEN :fromParam AND :toParam AND message.deleted=TRUE"),
          @NamedQuery(name="findDeletedMessagesInMailboxWithUID",
          query="SELECT message.uid FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid=:uidParam AND message.deleted=TRUE"),
          @NamedQuery(name="findDeletedMessagesInMailboxAfterUID",
          query="SELECT message.uid FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid>=:uidParam AND message.deleted=TRUE"),

          @NamedQuery(name="deleteDeletedMessagesInMailbox",
          query="DELETE FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.deleted=TRUE"),
          @NamedQuery(name="deleteDeletedMessagesInMailboxBetweenUIDs",
          query="DELETE FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid BETWEEN :fromParam AND :toParam AND message.deleted=TRUE"),
          @NamedQuery(name="deleteDeletedMessagesInMailboxWithUID",
          query="DELETE FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid=:uidParam AND message.deleted=TRUE"),
          @NamedQuery(name="deleteDeletedMessagesInMailboxAfterUID",
          query="DELETE FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.uid>=:uidParam AND message.deleted=TRUE"),

          @NamedQuery(name="countUnseenMessagesInMailbox",
          query="SELECT COUNT(message) FROM Message message WHERE message.mailbox.mailboxId = :idParam AND message.seen=FALSE"),
          @NamedQuery(name="countMessagesInMailbox",
          query="SELECT COUNT(message) FROM Message message WHERE message.mailbox.mailboxId = :idParam"),
          @NamedQuery(name="deleteMessages",
          query="DELETE FROM Message message WHERE message.mailbox.mailboxId = :idParam"),
          @NamedQuery(name="findLastUidInMailbox",
          query="SELECT message.uid FROM Message message WHERE message.mailbox.mailboxId = :idParam ORDER BY message.uid DESC"),
          @NamedQuery(name="deleteAllMemberships",
          query="DELETE FROM Message message")
          })
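
          Most of the message queries above select by mailbox id and a uid range, which suggests a composite key of (mailboxId, uid). The following is a rough sketch of that idea, with a sorted map standing in for sorted row keys. MessageIndex, Key and the method names are hypothetical, and only the deleted flag is modelled. Flag-based queries still need a filter (or a separate index) on top of the range scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical sketch: messages keyed by (mailboxId, uid) in a sorted map. */
final class MessageIndex {
    /** Composite key ordered by mailboxId first, then uid. */
    static final class Key implements Comparable<Key> {
        final long mailboxId;
        final long uid;

        Key(long mailboxId, long uid) {
            this.mailboxId = mailboxId;
            this.uid = uid;
        }

        @Override
        public int compareTo(Key other) {
            int byMailbox = Long.compare(mailboxId, other.mailboxId);
            return byMailbox != 0 ? byMailbox : Long.compare(uid, other.uid);
        }
    }

    // Value is true when the message is flagged as deleted.
    private final NavigableMap<Key, Boolean> rows = new TreeMap<>();

    void add(long mailboxId, long uid, boolean deleted) {
        rows.put(new Key(mailboxId, uid), deleted);
    }

    /** Like findMessagesInMailboxBetweenUIDs: one contiguous range, both bounds inclusive. */
    List<Long> uidsBetween(long mailboxId, long from, long to) {
        List<Long> uids = new ArrayList<>();
        for (Key key : rows.subMap(new Key(mailboxId, from), true,
                                   new Key(mailboxId, to), true).keySet()) {
            uids.add(key.uid);
        }
        return uids;
    }

    /** Like findDeletedMessagesInMailbox: scan the whole mailbox range, then filter on the flag. */
    List<Long> deletedUids(long mailboxId) {
        List<Long> uids = new ArrayList<>();
        for (Map.Entry<Key, Boolean> entry : rows.subMap(new Key(mailboxId, Long.MIN_VALUE), true,
                                                         new Key(mailboxId, Long.MAX_VALUE), true)
                                                 .entrySet()) {
            if (entry.getValue()) {
                uids.add(entry.getKey().uid);
            }
        }
        return uids;
    }

    /** Like findLastUidInMailbox: the greatest key in the mailbox, or -1 if it has no messages. */
    long lastUid(long mailboxId) {
        Key last = rows.floorKey(new Key(mailboxId, Long.MAX_VALUE));
        return (last == null || last.mailboxId != mailboxId) ? -1L : last.uid;
    }
}
```

          With this layout the uid-based queries touch a single contiguous key range, while the seen/recent/deleted variants read the mailbox range and discard non-matching rows.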

          From JPAMailbox:
          @NamedQuery(name="findMailboxById",
          query="SELECT mailbox FROM Mailbox mailbox WHERE mailbox.mailbox.mailboxId = :idParam"),
          @NamedQuery(name="findMailboxByName",
          query="SELECT mailbox FROM Mailbox mailbox WHERE mailbox.name = :nameParam and mailbox.user is NULL and mailbox.namespace= :namespaceParam"),
          @NamedQuery(name="findMailboxByNameWithUser",
          query="SELECT mailbox FROM Mailbox mailbox WHERE mailbox.name = :nameParam and mailbox.user= :userParam and mailbox.namespace= :namespaceParam"),
          @NamedQuery(name="deleteAllMailboxes",
          query="DELETE FROM Mailbox mailbox"),
          @NamedQuery(name="findMailboxWithNameLikeWithUser",
          query="SELECT mailbox FROM Mailbox mailbox WHERE mailbox.name LIKE :nameParam and mailbox.user= :userParam and mailbox.namespace= :namespaceParam"),
          @NamedQuery(name="findMailboxWithNameLike",
          query="SELECT mailbox FROM Mailbox mailbox WHERE mailbox.name LIKE :nameParam and mailbox.user is NULL and mailbox.namespace= :namespaceParam"),
          @NamedQuery(name="countMailboxesWithNameLikeWithUser",
          query="SELECT COUNT(mailbox) FROM Mailbox mailbox WHERE mailbox.name LIKE :nameParam and mailbox.user= :userParam and mailbox.namespace= :namespaceParam"),
          @NamedQuery(name="countMailboxesWithNameLike",
          query="SELECT COUNT(mailbox) FROM Mailbox mailbox WHERE mailbox.name LIKE :nameParam and mailbox.user is NULL and mailbox.namespace= :namespaceParam"),
          @NamedQuery(name="listMailboxes",
          query="SELECT mailbox FROM Mailbox mailbox")
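
          The LIKE queries above are harder to serve from a sorted key space. As a sketch under stated assumptions: if the key is built from namespace, user and name, then a pattern of the form 'prefix%' becomes a range scan, while patterns with a leading wildcard would still need a full scan. MailboxNames, its NUL separator and its method names are hypothetical choices for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch: mailbox keys of the form namespace, user, name in a sorted set. */
final class MailboxNames {
    // Assumed separator; it must not occur inside namespace, user or name.
    private static final char SEP = '\0';

    private final TreeSet<String> keys = new TreeSet<>();

    /** A null user (shared mailbox) is stored as an empty user part. */
    static String key(String namespace, String user, String name) {
        return namespace + SEP + (user == null ? "" : user) + SEP + name;
    }

    void add(String namespace, String user, String name) {
        keys.add(key(namespace, user, name));
    }

    /** Approximates "name LIKE 'prefix%'" for one namespace and user as a range scan. */
    List<String> namesWithPrefix(String namespace, String user, String namePrefix) {
        String start = key(namespace, user, namePrefix);
        // Exclusive upper bound; names containing '\uffff' right after the prefix are not covered.
        String end = start + Character.MAX_VALUE;
        List<String> names = new ArrayList<>();
        for (String k : keys.subSet(start, true, end, false)) {
            names.add(k.substring(k.lastIndexOf(SEP) + 1));
        }
        return names;
    }
}
```

          An exact lookup such as findMailboxByNameWithUser is then a single get on the full key, and the count variants are the size of the same range.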

          From JPASubscription:
          @NamedQueries({
          @NamedQuery(name="findFindMailboxSubscriptionForUser",
          query="SELECT subscription FROM Subscription subscription WHERE subscription.username = :userParam AND subscription.mailbox = :mailboxParam"),
          @NamedQuery(name="findSubscriptionsForUser",
          query="SELECT subscription FROM Subscription subscription WHERE subscription.username = :userParam")
          })

          Ioan Eugen Stan added a comment -

          better approach.


            People

            • Assignee: Norman Maurer
            • Reporter: Eric Charles
            • Votes: 0
            • Watchers: 3
