James Imap
  1. James Imap
  2. IMAP-373

Caching for Mailbox/Message Mappers

    Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Mailbox
    • Labels:
      None

      Description

      Because of reasons similar to IMAP-371 the performance of James Imap suffers when there are many clients.

      The core issue is that the Mailbox/Message Mappers methods implementations can be potentially quite expensive, yet in many cases their result is "constant" for the input arguments, when the underlying data does not change. This makes them a good candidate for caching.
      Another issue is that some of the methods are possibly needlessly called mutiple times inside a single request, and this is in focus of IMAP-371.

      The cache can be implemented for example using Decorator pattern on top of the existing Mailbox/Message Mappers.

      In the first step caching may involve MailboxMapper.findMailboxByPath, next steps may be some simple yet potentially expensive aspects of MessageMapper like countMessagesInMailbox(), countUnseenMessagesInMailbox(), getLastUid(), getHighestModSeq()

      Because we have notifications that are fired when something happens to Mailboxes and Messages, it should be quite easy to invalidate stale cache entries. The invalidation can be registered as a global listener the AbstractDelegatingMailboxListener.

      So: the general caching Mailbox/Message Mappers implementation would be abstract, and there needs to be a concrete implementation too.

      For concrete implementation it would be nice to use something like guava Cache built with CacheBuilder with some nice options like proper TTL, size etc. The guestion is whether guava is license-compatible with James.

      What do you think about it?

      1. IMAP-373-v1.patch
        25 kB
        Andrzej Rusin

        Activity

        Hide
        Ioan Eugen Stan added a comment -

        +1 totally agree, especially with the use of guava. It's also Apache Licensed so there are no issues here [1].

        [1] http://code.google.com/p/guava-libraries/source/browse/COPYING

        Show
        Ioan Eugen Stan added a comment - +1 totally agree, especially with the use of guava. It's also Apache Licensed so there are no issues here [1] . [1] http://code.google.com/p/guava-libraries/source/browse/COPYING
        Hide
        Eric Charles added a comment -

        We now only support a single instance of the server, but going distributed, we will need to (re)think how the events are propagated, but also how this new cache are maintained distributely.

        Let's start with a single instance implementation. The interface contract will allow to have various implementations indeed. Take care also about how you inject those implementations (for now, we inject with spring).

        Thx, Eric

        Show
        Eric Charles added a comment - We now only support a single instance of the server, but going distributed, we will need to (re)think how the events are propagated, but also how this new cache are maintained distributely. Let's start with a single instance implementation. The interface contract will allow to have various implementations indeed. Take care also about how you inject those implementations (for now, we inject with spring). Thx, Eric
        Hide
        Ioan Eugen Stan added a comment -

        Good points Eric, I think we can use memcached for that as suggested in the Guava docs. Let's worry it that when we have that issue.

        Show
        Ioan Eugen Stan added a comment - Good points Eric, I think we can use memcached for that as suggested in the Guava docs. Let's worry it that when we have that issue.
        Hide
        Andrzej Rusin added a comment -

        Internally we are testing multiple server instances, based on a broadcasting version of AbstractDelegaintgMailboxListener on top of a multicast notifications solution. That may take care of distriubuted caches invalidation too.
        On the other hand, in my opinion (guesstimate) memcached network overhead/latency may be too big here. Of course I may be wrong.

        I already started coding. I may have a proof of concept implementation soon.

        Show
        Andrzej Rusin added a comment - Internally we are testing multiple server instances, based on a broadcasting version of AbstractDelegaintgMailboxListener on top of a multicast notifications solution. That may take care of distriubuted caches invalidation too. On the other hand, in my opinion (guesstimate) memcached network overhead/latency may be too big here. Of course I may be wrong. I already started coding. I may have a proof of concept implementation soon.
        Hide
        Eric Charles added a comment -

        Awesome Andrzej.

        As distributed cache, Hazelcast is my favorite.
        Did you implement your own distributed cache on top of lower-level multicast communication of do you use an existing library for this?

        Thx again for all your goodies,

        Eric

        Show
        Eric Charles added a comment - Awesome Andrzej. As distributed cache, Hazelcast is my favorite. Did you implement your own distributed cache on top of lower-level multicast communication of do you use an existing library for this? Thx again for all your goodies, Eric
        Hide
        Andrzej Rusin added a comment -

        Eric,

        We use jgroups, solely for James Events for MailboxListener propagation between cluster members (dedicated IMAP and SMTP jameses and their "store backends").
        Additionally, as a quick hack to partially overcome IMAP-371 I implemented in the past a naive memcached caching of mailboxes, with "manual" invalidation from respective code across my "store" (which is used by a webmail solution and other things at the same time too). But that caching is nothing like any elegant solution, hence my work on IMAP-373.
        BTW, we were trying to use Hazelcast for Events propagation, but we ran into "not explainable" "issues" with it (under load), that's why we switched to jgroups with a possibly better result.

        Show
        Andrzej Rusin added a comment - Eric, We use jgroups, solely for James Events for MailboxListener propagation between cluster members (dedicated IMAP and SMTP jameses and their "store backends"). Additionally, as a quick hack to partially overcome IMAP-371 I implemented in the past a naive memcached caching of mailboxes, with "manual" invalidation from respective code across my "store" (which is used by a webmail solution and other things at the same time too). But that caching is nothing like any elegant solution, hence my work on IMAP-373 . BTW, we were trying to use Hazelcast for Events propagation, but we ran into "not explainable" "issues" with it (under load), that's why we switched to jgroups with a possibly better result.
        Hide
        Eric Charles added a comment -

        Thx for the details Andrzej.
        I am sure you are aware of this, but worth to say it: we are always very happy to review and welcome contributions such as distributed event propagation via jgroups.

        JGroups is indeed very suited to implement multicast communication. Zeromq is another option to my knowledge, but put extra requirement on native code (c library) to be installed on the host.

        Hazelcast is normally used in large deployments, I guess it should be possible to make it work as wanted.

        Thx, Eric

        Show
        Eric Charles added a comment - Thx for the details Andrzej. I am sure you are aware of this, but worth to say it: we are always very happy to review and welcome contributions such as distributed event propagation via jgroups. JGroups is indeed very suited to implement multicast communication. Zeromq is another option to my knowledge, but put extra requirement on native code (c library) to be installed on the host. Hazelcast is normally used in large deployments, I guess it should be possible to make it work as wanted. Thx, Eric
        Hide
        Andrzej Rusin added a comment -

        Eric,

        I have a first version of cache implementation for review. Should I attach a patch/zip, or commit it to SVN?

        The place I think it can go is /mailbox/caching/src/main/java, package being org.apache.mailbox.caching for the common part, and org.apache.mailbox.caching.guava for the guava-specific part (should it have a separate directory too?).

        Please let me know.

        Show
        Andrzej Rusin added a comment - Eric, I have a first version of cache implementation for review. Should I attach a patch/zip, or commit it to SVN? The place I think it can go is /mailbox/caching/src/main/java, package being org.apache.mailbox.caching for the common part, and org.apache.mailbox.caching.guava for the guava-specific part (should it have a separate directory too?). Please let me know.
        Hide
        Eric Charles added a comment -

        I will take a few days to have your account setup and operational.
        So best to attach the patch here.
        Thx, Eric

        Show
        Eric Charles added a comment - I will take a few days to have your account setup and operational. So best to attach the patch here. Thx, Eric
        Hide
        Andrzej Rusin added a comment -

        Initial version for review if it makes any sense.

        Show
        Andrzej Rusin added a comment - Initial version for review if it makes any sense.
        Hide
        Eric Charles added a comment -

        Hi Andzej,
        I have taken your patch, made a few changes (maven pom + compilation issues) and have committed in trunk.

        The main impact for now is that I had to make the MailboxSessionMapperFactory.create* abstract methods public
        + public MailboxMapper<Integer> createMailboxMapper(MailboxSession session)
        + public MessageMapper<Integer> createMessageMapper(MailboxSession session)
        + public SubscriptionMapper createSubscriptionMapper(Mai

        This is not ideal, but not a big deal IMHO.

        I have put some TODO(eric) for the 3 other ones (stub implementation to make it compile, but need to be further implemented/fixed).
        Type mismatch: cannot convert from CachingMessageMapper<Id> to MailboxMapper<Id> CachingMailboxSessionMapperFactory.java /apache-james-mailbox-caching/src/main/java/org/apache/mailbox/caching line 33 Java Problem
        The constructor CachingMessageMapper<Id>(MessageMapper<Id>, MailboxByPathCache<Id>) is undefined CachingMailboxSessionMapperFactory.java /apache-james-mailbox-caching/src/main/java/org/apache/mailbox/caching line 27 Java Problem
        The constructor CachingMessageMapper<Id>(MailboxMapper<Id>, MailboxMetadataCache<Id>) is undefined CachingMailboxSessionMapperFactory.java /apache-james-mailbox-caching/src/main/java/org/apache/mailbox/caching line 33 Java Problem

        If you svn up, you will be able to further work on that module.

        About the API and strategy, I wonder why we need those additional MailboxByPathCache/MailboxMetadataCache interfaces.
        Wouldn't the cache be just a Decorator around the existing API?

        Thx again, Eric

        Show
        Eric Charles added a comment - Hi Andzej, I have taken your patch, made a few changes (maven pom + compilation issues) and have committed in trunk. The main impact for now is that I had to make the MailboxSessionMapperFactory.create* abstract methods public + public MailboxMapper<Integer> createMailboxMapper(MailboxSession session) + public MessageMapper<Integer> createMessageMapper(MailboxSession session) + public SubscriptionMapper createSubscriptionMapper(Mai This is not ideal, but not a big deal IMHO. I have put some TODO(eric) for the 3 other ones (stub implementation to make it compile, but need to be further implemented/fixed). Type mismatch: cannot convert from CachingMessageMapper<Id> to MailboxMapper<Id> CachingMailboxSessionMapperFactory.java /apache-james-mailbox-caching/src/main/java/org/apache/mailbox/caching line 33 Java Problem The constructor CachingMessageMapper<Id>(MessageMapper<Id>, MailboxByPathCache<Id>) is undefined CachingMailboxSessionMapperFactory.java /apache-james-mailbox-caching/src/main/java/org/apache/mailbox/caching line 27 Java Problem The constructor CachingMessageMapper<Id>(MailboxMapper<Id>, MailboxMetadataCache<Id>) is undefined CachingMailboxSessionMapperFactory.java /apache-james-mailbox-caching/src/main/java/org/apache/mailbox/caching line 33 Java Problem If you svn up, you will be able to further work on that module. About the API and strategy, I wonder why we need those additional MailboxByPathCache/MailboxMetadataCache interfaces. Wouldn't the cache be just a Decorator around the existing API? Thx again, Eric
        Hide
        Andrzej Rusin added a comment -

        Eric,

        Thank you for committing it.

        The compilation problems can be quickly resolved and I will do that.

        API strategy: Those additional MailboxByPathCache/MailboxMetadataCache interfaces were intended just to make the CachingMessageMapper and CachingMailboxMapper clearly separate from the underlying caches. I believe the interfaces also convey the purpose of it.
        Other option would be to make the current CachingMessageMapper and CachingMailboxMapper abstract on the cacheable methods, and let descendants decide what to do there. But it does not introduce substantial improvement.
        If you have a better idea for that, let me know. I am open to any suggestions.

        Regards,
        Andrzej

        Show
        Andrzej Rusin added a comment - Eric, Thank you for committing it. The compilation problems can be quickly resolved and I will do that. API strategy: Those additional MailboxByPathCache/MailboxMetadataCache interfaces were intended just to make the CachingMessageMapper and CachingMailboxMapper clearly separate from the underlying caches. I believe the interfaces also convey the purpose of it. Other option would be to make the current CachingMessageMapper and CachingMailboxMapper abstract on the cacheable methods, and let descendants decide what to do there. But it does not introduce substantial improvement. If you have a better idea for that, let me know. I am open to any suggestions. Regards, Andrzej
        Hide
        Andrzej Rusin added a comment -

        Eric,

        Looks like you deleted all files from SVN just after you added them...

        Show
        Andrzej Rusin added a comment - Eric, Looks like you deleted all files from SVN just after you added them...
        Hide
        Eric Charles added a comment -

        Hi Andrzej,
        Nope, they are still there: https://svn.apache.org/repos/asf/james/mailbox/trunk/caching/
        I just deleted the target folder.

        Thx, Eric

        Show
        Eric Charles added a comment - Hi Andrzej, Nope, they are still there: https://svn.apache.org/repos/asf/james/mailbox/trunk/caching/ I just deleted the target folder. Thx, Eric
        Show
        Andrzej Rusin added a comment - This folder is empty: https://svn.apache.org/repos/asf/james/mailbox/trunk/caching/src/main/java/org/apache/mailbox/caching/
        Hide
        Eric Charles added a comment -

        Source is back in svn. Sorry for the mess.
        For the MailboxByPathCache/MailboxMetadataCache, I favor composition vs inheritance, so that's fine to have those additional interfaces.
        Thx, Eric

        Show
        Eric Charles added a comment - Source is back in svn. Sorry for the mess. For the MailboxByPathCache/MailboxMetadataCache, I favor composition vs inheritance, so that's fine to have those additional interfaces. Thx, Eric
        Hide
        Andrzej Rusin added a comment -

        Eric,
        I fixed the compilation problems. I will soon start testing the code on my test servers.

        Show
        Andrzej Rusin added a comment - Eric, I fixed the compilation problems. I will soon start testing the code on my test servers.

          People

          • Assignee:
            Eric Charles
            Reporter:
            Andrzej Rusin
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:

              Development