  1. James Server
  2. JAMES-4041

OOM upon IMAP copy

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: master, 3.7.5, 3.8.2
    • Fix Version/s: master, 3.8.3
    • Component/s: IMAPServer, mailbox
    • Labels: None

    Description

      I encountered this on production:

      java.lang.OutOfMemoryError: Java heap space
      	at java.base/java.util.Arrays.copyOf(Unknown Source)
      	at java.base/java.util.ArrayList.grow(Unknown Source)
      	at java.base/java.util.ArrayList.grow(Unknown Source)
      	at java.base/java.util.ArrayList.add(Unknown Source)
      	at java.base/java.util.ArrayList.add(Unknown Source)
      	at org.apache.james.mailbox.model.MessageRange.split(MessageRange.java:247)
      	at org.apache.james.mailbox.store.MessageBatcher.batchMessagesReactive(MessageBatcher.java:70)
      	at org.apache.james.mailbox.store.StoreMailboxManager.lambda$copyMessagesReactive$48(StoreMailboxManager.java:713)
      	at org.apache.james.mailbox.store.StoreMailboxManager$$Lambda/0x00007f12613caab8.apply(Unknown Source)
      	at reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain.onNext(MonoFlatMapMany.java:163)
      	at reactor.core.publisher.MonoZip$ZipCoordinator.signal(MonoZip.java:297)
      	at reactor.core.publisher.MonoZip$ZipInner.onNext(MonoZip.java:478)
      	at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:122)
      	at reactor.core.publisher.FluxSwitchIfEmpty$SwitchIfEmptySubscriber.onNext(FluxSwitchIfEmpty.java:74)
      	at reactor.core.publisher.MonoZip$ZipCoordinator.signal(MonoZip.java:297)
      	at reactor.core.publisher.MonoZip$ZipInner.onNext(MonoZip.java:478)
      	at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondComplete(MonoFlatMap.java:245)
      	at reactor.core.publisher.MonoFlatMap$FlatMapInner.onNext(MonoFlatMap.java:305)
      	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
      	at reactor.core.publisher.Operators$ScalarSubscription.request(Operators.java:2571)
      	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.request(FluxMapFuseable.java:171)
      	at reactor.core.publisher.MonoFlatMap$FlatMapInner.onSubscribe(MonoFlatMap.java:291)
      	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onSubscribe(FluxMapFuseable.java:96)
      	at reactor.core.publisher.MonoJust.subscribe(MonoJust.java:55)
      	at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:76)
      	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:165)
      	at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onNext(FluxOnErrorResume.java:79)
      	at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:122)
      	at reactor.core.publisher.MonoPublishOn$PublishOnSubscriber.run(MonoPublishOn.java:181)
      	at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:68)
      	at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:28)
      	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
      

      I was able to reproduce it (cf. screenshot).

      This was actually encountered with the following batchSizes:

      copy=10
      move=10
      

      Aggressively increasing the batch size actually worked as a workaround:

      copy=2000000000
      move=2000000000
      

      However, I fear this means the overall batching process for MOVE and COPY makes little sense...
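
      To illustrate why a small batch size can exhaust the heap while a huge one does not, here is a back-of-the-envelope sketch. It rests on two assumptions that the trace above does not confirm: that the COPY targeted an open-ended range such as 1:* resolved to the full 32-bit IMAP UID space, and that the split materializes one sub-range object per batch. `SplitCountEstimate` and `subRangeCount` are hypothetical names, not James code.

      ```java
      public class SplitCountEstimate {
          // Assumption: an open-ended range resolves to the largest 32-bit IMAP UID.
          static final long MAX_IMAP_UID = 4_294_967_295L;

          // Number of sub-ranges obtained when [from, to] is cut into chunks of batchSize UIDs.
          static long subRangeCount(long from, long to, long batchSize) {
              if (batchSize <= 0 || from > to) {
                  throw new IllegalArgumentException("batchSize must be > 0 and from <= to");
              }
              long span = to - from + 1;
              return (span + batchSize - 1) / batchSize; // ceiling division
          }

          public static void main(String[] args) {
              // copy=10: hundreds of millions of range objects in a single list
              System.out.println(subRangeCount(1, MAX_IMAP_UID, 10));             // 429496730
              // copy=2000000000: only a handful of range objects
              System.out.println(subRangeCount(1, MAX_IMAP_UID, 2_000_000_000L)); // 3
          }
      }
      ```

      Under these assumptions, the cost of the split depends on the width of the requested range rather than on the number of messages actually present, which would be consistent with the workaround helping.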

      I do think this could be handled in a purely reactive way:

      • Fetch all the messages in the range
      • window them using the batch size
      • perform the update one window at a time
      • and finally aggregate the resulting MessageRange
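
      The steps above could be sketched roughly as follows. This is a plain-Java illustration that uses only the JDK instead of Reactor, and it is not the James implementation: `LazyBatchCopy`, `copyInWindows`, `UidRange`, and the `copyWindow` callback are hypothetical names. The idea it shows is that windows are built from the messages that actually exist, one at a time, so memory use is bounded by the batch size rather than by the width of the requested range. Aggregating into a single covering range is a simplification.

      ```java
      import java.util.ArrayList;
      import java.util.Iterator;
      import java.util.List;
      import java.util.Optional;
      import java.util.function.Consumer;

      public class LazyBatchCopy {
          // Hypothetical stand-in for a UID range (not the James MessageRange class).
          public static final class UidRange {
              public final long from;
              public final long to;

              UidRange(long from, long to) {
                  this.from = from;
                  this.to = to;
              }
          }

          // Reads the UIDs of existing messages lazily, hands them to copyWindow one
          // window at a time, and returns the range covering everything processed.
          public static Optional<UidRange> copyInWindows(Iterator<Long> uids, int batchSize,
                                                         Consumer<List<Long>> copyWindow) {
              if (batchSize <= 0) {
                  throw new IllegalArgumentException("batchSize must be greater than zero");
              }
              long min = Long.MAX_VALUE;
              long max = Long.MIN_VALUE;
              boolean any = false;
              List<Long> window = new ArrayList<>();
              while (uids.hasNext()) {
                  long uid = uids.next();
                  window.add(uid);
                  min = Math.min(min, uid);
                  max = Math.max(max, uid);
                  any = true;
                  if (window.size() == batchSize) {
                      copyWindow.accept(List.copyOf(window)); // process a full window
                      window.clear();
                  }
              }
              if (!window.isEmpty()) {
                  copyWindow.accept(List.copyOf(window)); // process the trailing partial window
              }
              return any ? Optional.of(new UidRange(min, max)) : Optional.empty();
          }

          public static void main(String[] args) {
              List<List<Long>> windows = new ArrayList<>();
              Optional<UidRange> result =
                  copyInWindows(List.of(1L, 2L, 5L, 8L, 13L).iterator(), 2, windows::add);
              System.out.println(windows); // [[1, 2], [5, 8], [13]]
              result.ifPresent(r -> System.out.println(r.from + ":" + r.to)); // 1:13
          }
      }
      ```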

      I will try to take a shot at it later this week.

      BTW, to my great displeasure, it was not possible to disable batching...

      Caused by: java.lang.IllegalArgumentException: 'copyBatchSize' must be greater than zero
      	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:143)
      	at org.apache.james.mailbox.store.BatchSizes$Builder.copyBatchSize(BatchSizes.java:86)
      	at org.apache.james.modules.mailbox.CassandraSessionModule.getBatchSizesConfiguration(CassandraSessionModule.java:109)
      

          People

            Assignee: Unassigned
            Reporter: Benoit Tellier (btellier)
              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 50m