Usergrid (Retired) / USERGRID-1346

Tomcat - out of memory exceptions

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.0
    • Component/s: Stack
    • Labels: None
    • Environment: Ubuntu 14.04, Tomcat 7, JDK 1.8.0_65 (Oracle);
      Cassandra version: 2.2.6 (DataStax);
      Usergrid version: 2.2.0 (Master branch, 3rd May, 2016)

    Description

      Hello Usergrid Team,

      We are suddenly seeing "out of memory" exceptions on our Tomcat servers under low load. Our Usergrid installations have been very stable over the last 6 months, and we have not seen such issues before. Below are a few log entries that have recently started showing up.

      ------------------------------------------------------------
      Nov 09 16:15:26 catalina.out: 05:45:26,812 WARN EntityMappingParser:116 - Encountered 2 collections consecutively. N+1 dimensional arrays are unsupported, only arrays of depth 1 are supported
      ------------------------------------------------------------
      Nov 09 17:22:12 catalina.out: 06:52:12,848 WARN AsyncEventServiceImpl:362 - No index operation messages came back from event processing for msg:
      ------------------------------------------------------------
      Nov 09 17:39:56 catalina.out: 07:09:56,177 INFO transport:470 - [ip-10-0-2-128] failed to get local cluster state for transport#-3[ip-10-0-2-128][inet[/10.0.4.205:9300]], disconnecting...
      Nov 09 17:39:56 catalina.out: org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[/10.0.4.205:9300]][cluster:monitor/state] request_id [11652] timed out after [5247ms]
      Nov 09 17:39:56 catalina.out: at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:529)
      Nov 09 17:39:56 catalina.out: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      Nov 09 17:39:56 catalina.out: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      Nov 09 17:39:56 catalina.out: at java.lang.Thread.run(Thread.java:745)
      ------------------------------------------------------------
      Nov 09 17:40:17 catalina.out: 07:10:17,557 WARN transport:415 - [ip-10-0-2-128] Received response for a request that has timed out, sent [10887ms] ago, timed out [3ms] ago, action [cluster:monitor/state], node [[bluedls__us-east-1a__db__10.0.4.63][T6OWiR1US9m5ABxHh0tW0w][ip-10-0-4-63][inet[/10.0.4.63:9300]]{zone=us-east-1__us-east-1a}], id [11678]
      ------------------------------------------------------------
      Nov 09 17:43:05 catalina.out: 07:13:05,091 ERROR AbstractExceptionMapper:74 - com.netflix.hystrix.exception.HystrixRuntimeException 5XX Uncaught Exception (500)
      Nov 09 17:43:05 catalina.out: com.netflix.hystrix.exception.HystrixRuntimeException: ConsistentReplayCommand timed-out and fallback failed.
      ..
      Nov 09 17:43:05 catalina.out: Caused by: java.util.concurrent.TimeoutException
      ..
      Nov 09 17:43:05 catalina.out: Caused by: rx.exceptions.OnErrorThrowable$OnNextValue: OnError while emitting onNext value: org.apache.usergrid.persistence.collection.mvcc.stage.CollectionIoEvent.class
      ..
      Nov 09 17:43:05 catalina.out: 07:13:05,123 ERROR AbstractExceptionMapper:108 - Server Error (500):
      Nov 09 17:43:05 catalina.out: {"error":"hystrix_runtime","timestamp":1510229585122,"duration":0,"error_description":"ConsistentReplayCommand timed-out and fallback failed.","exception":"com.netflix.hystrix.exception.HystrixRuntimeException"}
      ------------------------------------------------------------
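      Each log line above carries two clock readings (the syslog prefix and the application's own timestamp), and the JSON error body adds a third as epoch milliseconds. The following is a minimal sketch for lining them up when correlating these entries with Cassandra or Elasticsearch metrics. The two UTC offsets are assumptions inferred from the log values themselves; they are not stated anywhere in this report.

```python
from datetime import datetime, timedelta, timezone

# "timestamp" field copied from the JSON error body in the log (epoch milliseconds).
error_ts_ms = 1510229585122

# Integer arithmetic from the epoch avoids float rounding of the milliseconds.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
utc = EPOCH + timedelta(milliseconds=error_ts_ms)


def clock_at(offset: timedelta) -> str:
    """Render the error instant as HH:MM:SS at a fixed UTC offset."""
    return utc.astimezone(timezone(offset)).strftime("%H:%M:%S")


# ASSUMPTION: the syslog prefix ("Nov 09 17:43:05") is written at UTC+05:30.
syslog_clock = clock_at(timedelta(hours=5, minutes=30))

# ASSUMPTION: the application timestamp ("07:13:05,123") is written at UTC-05:00.
app_clock = clock_at(timedelta(hours=-5))

print(utc.strftime("%Y-%m-%d %H:%M:%S"), "UTC")  # 2017-11-09 12:13:05 UTC
print(syslog_clock)                              # 17:43:05
print(app_clock)                                 # 07:13:05
```

      Under those assumed offsets, all three readings describe the same instant, so the syslog prefix and the application timestamp can be used interchangeably once converted to UTC.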

      Our monitoring shows no issues in the Cassandra and Elasticsearch clusters. We look forward to your help.

      Thanks

          People

            Assignee: Unassigned
            Reporter: Jaskaran