Apache Cassandra / CASSANDRA-18218

OOM while running ShortAccordSimulationTest

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: NA
    • Component/s: Accord, Test/burn

    Description

      ShortAccordSimulationTest consistently runs out of heap when run locally, even with the number of threads and the duration scaled down:

      AccordSimulationRunner.main(new String[] { "run", "-n", "3", "-t", "10", "--cluster-action-limit", "-1", "-c", "2", "-s", "10"});
      
      ERROR [CommandStore[0]:1] node1 CS:[0] OP:0x8fb4f8f1 2023-01-31 17:16:27,143 Operation AsyncOperation{RUNNING}-0x8fb4f8f1 failed
      java.lang.OutOfMemoryError: Java heap space
         at java.util.Arrays.copyOf(Arrays.java:3181)
         at accord.local.Command$NotifyWaitingOn.push(Command.java:794)
         at accord.local.Command$NotifyWaitingOn.accept(Command.java:757)
         at accord.local.Command.maybeExecute(Command.java:600)
         at accord.local.Command.onChange(Command.java:546)
         at org.apache.cassandra.service.accord.ListenerProxy$CommandListenerProxy.lambda$onChange$0(ListenerProxy.java:148)
         at org.apache.cassandra.service.accord.ListenerProxy$CommandListenerProxy$$Lambda$4606/1457728285.accept(Unknown Source)
         at org.apache.cassandra.service.accord.async.AsyncOperation$ForConsumer.apply(AsyncOperation.java:261)
         at org.apache.cassandra.service.accord.async.AsyncOperation$ForConsumer.apply(AsyncOperation.java:248)
         at org.apache.cassandra.service.accord.async.AsyncOperation.runInternal(AsyncOperation.java:154)
         at org.apache.cassandra.service.accord.async.AsyncOperation.run(AsyncOperation.java:194)
         at org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96)
         at org.apache.cassandra.concurrent.SyncFutureTask.run(SyncFutureTask.java:68)
         at org.apache.cassandra.simulator.systems.InterceptingExecutor$AbstractSingleThreadedExecutorPlus.lambda$new$0(InterceptingExecutor.java:584)
         at org.apache.cassandra.simulator.systems.InterceptingExecutor$AbstractSingleThreadedExecutorPlus$$Lambda$768/827906088.run(Unknown Source)
         at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
         at java.lang.Thread.run(Thread.java:748)
      

      The logged JVM arguments include both -Xmx8G and -Xmx1024m, so it looks like we’re passing both an 8 GiB and a 1 GiB heap size, although that doesn’t seem to have any bearing on the result. Setting only 8 GiB just takes longer to hit the same problem.

      INFO  [isolatedExecutor:2] node1 2023-01-31 17:15:56,940 JVM Arguments: [-Dstorage-config=/Users/maedhroz/Forks/cassandra/test/conf, -Djava.awt.headless=true, -javaagent:/Users/maedhroz/Forks/cassandra/lib/jamm-0.3.2.jar, -ea, -Djava.io.tmpdir=/var/folders/4d/zfjs7m7s6x5_l93k33r5k6680000gn/T/, -Dcassandra.debugrefcount=true, -Xss384k, -XX:SoftRefLRUPolicyMSPerMB=0, -XX:HeapDumpPath=build/test, -Dcassandra.test.driver.connection_timeout_ms=10000, -Dcassandra.test.driver.read_timeout_ms=24000, -Dcassandra.memtable_row_overhead_computation_step=100, -Dcassandra.test.use_prepared=true, -Dcassandra.test.sstableformatdevelopment=true, -Djava.security.egd=file:/dev/urandom, -Dcassandra.testtag=, -Dcassandra.keepBriefBrief=${cassandra.keepBriefBrief}, -Dcassandra.strict.runtime.checks=true, -Dcassandra.reads.thresholds.coordinator.defensive_checks_enabled=true, -DQT_SHRINKS=0, -Dlogback.configurationFile=test/conf/logback-simulator.xml, -Dcassandra.ring_delay_ms=10000, -Dcassandra.tolerate_sstable_size=true, -Dcassandra.skip_sync=true, -Dcassandra.debugrefcount=false, -Dcassandra.test.simulator.determinismcheck=strict, -Dcassandra.test.simulator.print_asm=none, -javaagent:/Users/maedhroz/Forks/cassandra/build/test/lib/jars/simulator-asm.jar, -Xbootclasspath/a:/Users/maedhroz/Forks/cassandra/build/test/lib/jars/simulator-bootstrap.jar, -XX:ActiveProcessorCount=4, -XX:-TieredCompilation, -XX:-BackgroundCompilation, -XX:CICompilerCount=1, -XX:Tier4CompileThreshold=1000, -XX:ReservedCodeCacheSize=256M, -Xmx8G, -Xmx1024m]
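
Since two -Xmx flags appear in the logged arguments, it may help to check which limit the JVM actually applied. The following is a minimal sketch using only standard JDK APIs. `HeapSettingsCheck` and `maxHeapFlags` are hypothetical names for illustration and are not part of Cassandra or the simulator. HotSpot generally honors the last occurrence of a repeated flag, but that is an assumption here, and the value printed by the running JVM is the one to trust.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Cassandra: reports the -Xmx flags the JVM
// was started with and the max heap it actually applied.
public class HeapSettingsCheck {

    /** Returns every argument that starts with -Xmx, in the order given. */
    static List<String> maxHeapFlags(List<String> jvmArgs) {
        List<String> flags = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-Xmx")) {
                flags.add(arg);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        // Arguments passed to this JVM (the same list the log line above prints).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("-Xmx flags seen: " + maxHeapFlags(jvmArgs));

        // Upper bound on heap the JVM will attempt to use, in bytes.
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("Effective max heap: " + (maxBytes >> 20) + " MiB");
    }
}
```

Running this with the same flags as the test JVM (for example `java -Xmx8G -Xmx1024m HeapSettingsCheck`) would show both flags alongside the limit that was applied.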
      

      Attachments

        1. java_pid73497.hprof.zip
          23.77 MB
          Caleb Rackliffe
        2. java_pid75762.hprof.zip
          34.26 MB
          Caleb Rackliffe

            People

              maedhroz Caleb Rackliffe
              maedhroz Caleb Rackliffe
              Benedict Elliott Smith
              David Capwell