Usergrid (Retired) / USERGRID-1259

Re-indexing ElasticSearch entity data from Cassandra - Possible Memory Leaks in Usergrid

Details

    • Story
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 2.1.0
    • 2.2.0
    • None
    • None
    • Important
    • Usergrid 36

    Description

      The full system re-index job (http://localhost:8080/system/index/rebuild) seems to stop or hang after 20-30 hours of indexing. Usergrid appears to exhaust its 4.5 GB of RAM. Please see the logs below:

      1. Usergrid logs (out of Java heap space)
      Feb 04 17:06:32 Usergrid-2 catalina.out: 06:36:31,961 WARN OioServerSocketPipelineSink:83 - Failed to accept a connection.
      Feb 04 17:06:32 Usergrid-2 catalina.out: java.lang.OutOfMemoryError: Java heap space
      Feb 04 15:05:03 Usergrid-2 catalina.out: 04:34:25,166 WARN jvm:203 - [default] [gc][old][29454][2683] duration [54.2s], collections [3]/[54.3s], total [54.2s]/[13.2h], memory [4.4gb]->[4.4gb]/[4.4gb], all_pools {[young] [532.5mb]->[532.5mb]/[532.5mb]} {[survivor] [62.5mb]->[65mb]/[66.5mb]} {[old] [3.8gb]->[3.8gb]/[3.8gb]}

      Feb 03 20:38:34 Usergrid-2 catalina.out: 10:08:34,616 ERROR AmazonAsyncEventService:361 - Failed to index message: 886ea9bd-708d-4bea-ab1c-844ff97c947c
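
      To make the GC warning above easier to read, here is a minimal sketch that parses the `all_pools` section of such a line and checks whether the old generation is still essentially full after collection. The function names (`pool_usage`, `old_gen_exhausted`) and the 95% threshold are assumptions made for illustration; they are not part of Usergrid or Elasticsearch. A full old generation only shows that the heap is exhausted; it does not by itself prove a leak.

```python
import re

# Matches one pool entry such as "{[old] [3.8gb]->[3.8gb]/[3.8gb]}".
POOL_RE = re.compile(
    r"\{\[(\w+)\]\s*"
    r"\[([\d.]+)(kb|mb|gb)\]->\[([\d.]+)(kb|mb|gb)\]/\[([\d.]+)(kb|mb|gb)\]\}"
)

# Conversion factors to megabytes.
UNIT_MB = {"kb": 1.0 / 1024, "mb": 1.0, "gb": 1024.0}

# Pool section copied from the GC warning in the log excerpt above.
SAMPLE = (
    "all_pools {[young] [532.5mb]->[532.5mb]/[532.5mb]} "
    "{[survivor] [62.5mb]->[65mb]/[66.5mb]} "
    "{[old] [3.8gb]->[3.8gb]/[3.8gb]}"
)


def pool_usage(line):
    """Return {pool: (before_mb, after_mb, max_mb)} for each pool in the line."""
    pools = {}
    for m in POOL_RE.finditer(line):
        before = float(m.group(2)) * UNIT_MB[m.group(3)]
        after = float(m.group(4)) * UNIT_MB[m.group(5)]
        maximum = float(m.group(6)) * UNIT_MB[m.group(7)]
        pools[m.group(1)] = (before, after, maximum)
    return pools


def old_gen_exhausted(line, threshold=0.95):
    """True if the old pool is at or above `threshold` of its maximum after GC."""
    pools = pool_usage(line)
    if "old" not in pools:
        return False
    _, after, maximum = pools["old"]
    return maximum > 0 and after / maximum >= threshold


if __name__ == "__main__":
    for name, (before, after, maximum) in pool_usage(SAMPLE).items():
        print(f"{name}: {after:.1f} MB used of {maximum:.1f} MB after GC")
    print("old generation exhausted:", old_gen_exhausted(SAMPLE))
```

      In the line above, a 54-second old-generation collection frees nothing (3.8gb before and after, out of 3.8gb), which is consistent with the OutOfMemoryError that follows.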

      2. ES logs (timeout; non-data node removed from cluster)
      Feb 04 16:00:33 Elasticsearch elasticsearch.log: [2016-02-04 05:31:17,243][INFO ][cluster.service ] [Arlette Truffaut] removed {[default][3GuynlamR6GvdiCAhhTBmw][ip-10-0-0-237][inet[/10.0.0.237:9301]]{client=true, data=false},}, reason: zen-disco-node_failed([default][3GuynlamR6GvdiCAhhTBmw][ip-10-0-0-237][inet[/10.0.0.237:9301]]{client=true, data=false}), reason failed to ping, tried [3] times, each with maximum [30s] timeout
      Feb 04 16:00:33 Elasticsearch elasticsearch.log: [2016-02-04 05:31:17,256][DEBUG][action.admin.cluster.node.stats] [Arlette Truffaut] failed to execute on node [3GuynlamR6GvdiCAhhTBmw]
      Feb 04 16:00:33 Elasticsearch elasticsearch.log: org.elasticsearch.transport.NodeDisconnectedException: [default][inet[/10.0.0.237:9301]][cluster:monitor/nodes/stats[n]] disconnected

      Is it possible that the re-indexing code in Usergrid has memory leaks and therefore uses up all of the Java heap memory?
      Please help.

      Thanks
      Jaskaran

          People

            mrusso Michael Russo
            jaskaran Jaskaran