  1. Cassandra
  2. CASSANDRA-6225

GCInspector should not wait after ConcurrentMarkSweep GC to flush memtables and reduce cache size

Details

    • Bug
    • Status: Resolved
    • Normal
    • Resolution: Not A Problem
    • Cassandra 1.2.9, SunOS, Java 7

    • Normal

    Description

      In GCInspector.logGCResults, Cassandra does not flush memtables or reduce cache sizes until after a ConcurrentMarkSweep GC has run. This causes long pauses on the service, and other nodes can then mark the node as DOWN.

      In our stress test, we used 64 concurrent threads to write data to Cassandra. Heap usage grew quickly and reached the maximum.
      We saw several ConcurrentMarkSweep GCs that freed very little memory until a memtable flush ran. The other nodes marked the node as DOWN whenever a GC took more than 20 seconds.

      INFO [ScheduledTasks:1] 2013-10-18 15:42:36,176 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 27481 ms for 1 collections, 5229917848 used; max is 6358564864
      INFO [ScheduledTasks:1] 2013-10-18 15:43:14,013 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 27729 ms for 1 collections, 5381504752 used; max is 6358564864
      INFO [ScheduledTasks:1] 2013-10-18 15:43:50,565 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 29867 ms for 1 collections, 5479631256 used; max is 6358564864
      INFO [ScheduledTasks:1] 2013-10-18 15:44:23,457 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 28166 ms for 1 collections, 5545752344 used; max is 6358564864
      INFO [ScheduledTasks:1] 2013-10-18 15:44:58,290 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 29377 ms for 2 collections, 5343255456 used; max is 6358564864
      
      INFO [GossipTasks:1] 2013-10-18 15:42:29,004 Gossiper.java (line 803) InetAddress /1.2.3.4 is now DOWN
      INFO [GossipTasks:1] 2013-10-18 15:43:06,901 Gossiper.java (line 803) InetAddress /1.2.3.4 is now DOWN
      INFO [GossipTasks:1] 2013-10-18 15:44:18,254 Gossiper.java (line 803) InetAddress /1.2.3.4 is now DOWN
      INFO [GossipTasks:1] 2013-10-18 15:44:48,507 Gossiper.java (line 803) InetAddress /1.2.3.4 is now DOWN
      INFO [GossipTasks:1] 2013-10-18 15:45:32,375 Gossiper.java (line 803) InetAddress /1.2.3.4 is now DOWN
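
To put numbers on the GCInspector lines above, the used and max figures can be pulled out and divided: after each collection the heap is still roughly 82% to 87% full. The following is a minimal sketch of that arithmetic. GcLogLine is a hypothetical helper, not part of Cassandra, and its pattern assumes only the message format shown in the log excerpt.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cassandra: parses the GCInspector
// message format quoted in the log excerpt and derives heap occupancy.
final class GcLogLine {
    // "GC for <collector>: <ms> ms for <n> collections, <used> used; max is <max>"
    private static final Pattern MESSAGE = Pattern.compile(
        "GC for (\\w+): (\\d+) ms for (\\d+) collections, (\\d+) used; max is (\\d+)");

    final String collector;
    final long durationMillis; // time reported for the collection(s)
    final int collections;
    final long usedBytes;      // heap in use after the collection
    final long maxBytes;       // maximum heap size

    private GcLogLine(String collector, long durationMillis, int collections,
                      long usedBytes, long maxBytes) {
        this.collector = collector;
        this.durationMillis = durationMillis;
        this.collections = collections;
        this.usedBytes = usedBytes;
        this.maxBytes = maxBytes;
    }

    // Extracts the fields from a full log line; rejects lines in another format.
    static GcLogLine parse(String line) {
        Matcher m = MESSAGE.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a GCInspector line: " + line);
        }
        return new GcLogLine(m.group(1),
                             Long.parseLong(m.group(2)),
                             Integer.parseInt(m.group(3)),
                             Long.parseLong(m.group(4)),
                             Long.parseLong(m.group(5)));
    }

    // Fraction of the maximum heap still in use after the collection.
    double heapFraction() {
        return (double) usedBytes / (double) maxBytes;
    }
}
```

Calling GcLogLine.parse on each line and printing heapFraction() makes it easy to see that successive full GCs reclaim almost nothing.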
      

      We found two workarounds for the long pauses that result in a DOWN status.
      1. We reduced the maximum heap to 3G. The behavior is the same, but GC is faster (under 20 seconds), so no nodes were marked as DOWN.

      2. We run a cron job on the Cassandra server that periodically calls nodetool -h localhost flush.
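
For anyone who prefers a small JVM process over cron, the second workaround could look roughly like the sketch below. It only wraps the same nodetool -h localhost flush command in a scheduler. PeriodicFlush and its method names are hypothetical, the period is an assumption (the report does not say how often the cron job ran), and nodetool is assumed to be on the PATH.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the cron job: runs "nodetool -h <host> flush"
// at a fixed delay. Assumes nodetool is on the PATH.
final class PeriodicFlush {
    // Builds the same command line the cron job calls.
    static List<String> flushCommand(String host) {
        return Arrays.asList("nodetool", "-h", host, "flush");
    }

    // Runs the command once and returns its exit code.
    static int runOnce(List<String> command) throws Exception {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    // Schedules the flush; the delay is measured from the end of one run to
    // the start of the next, so slow flushes never overlap.
    static ScheduledExecutorService schedule(List<String> command, long periodMinutes) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                int exit = runOnce(command);
                if (exit != 0) {
                    System.err.println("flush exited with code " + exit);
                }
            } catch (Exception e) {
                // Log and keep the schedule alive; an uncaught exception would cancel it.
                System.err.println("flush failed: " + e);
            }
        }, 0, periodMinutes, TimeUnit.MINUTES);
        return scheduler;
    }
}
```

A fixed delay rather than a fixed rate is used here because the report notes that a single flush can take more than 30 seconds under load.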

      Flushing after a full GC just makes things worse and wastes the time spent on that GC. On a heavily loaded system, several full GCs can run before a flush finishes (a flush may take more than 30 seconds).

      Ideally, GCInspector should have better logic for deciding when to flush memtables:
      1. Flush memtables and reduce cache sizes when heap usage reaches a threshold that is lower than the full GC threshold.
      2. Prevent overly frequent flushes by remembering when the last one ran.
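
The two points above can be sketched as a small decision class. This is only an illustration of the proposal, not Cassandra's GCInspector: FlushPolicy, its parameters, and the threshold and interval values are all hypothetical.

```java
// Hypothetical sketch of the proposed policy; not Cassandra code.
// Point 1: flush once heap usage crosses a threshold set below the level
//          at which a full GC would start.
// Point 2: remember the last flush time and refuse to flush again too soon.
final class FlushPolicy {
    private final double flushThreshold;  // fraction of max heap, e.g. 0.70 (assumed value)
    private final long minIntervalMillis; // minimum gap between flushes (assumed value)
    private boolean flushedBefore = false;
    private long lastFlushMillis = 0L;

    FlushPolicy(double flushThreshold, long minIntervalMillis) {
        if (flushThreshold <= 0.0 || flushThreshold >= 1.0) {
            throw new IllegalArgumentException("flushThreshold must be in (0, 1)");
        }
        if (minIntervalMillis < 0) {
            throw new IllegalArgumentException("minIntervalMillis must be >= 0");
        }
        this.flushThreshold = flushThreshold;
        this.minIntervalMillis = minIntervalMillis;
    }

    // Returns true when the caller should flush now, and records the time.
    synchronized boolean shouldFlush(long usedBytes, long maxBytes, long nowMillis) {
        if (maxBytes <= 0) {
            return false; // max heap unknown, nothing to compare against
        }
        double usage = (double) usedBytes / (double) maxBytes;
        if (usage < flushThreshold) {
            return false; // point 1: still below the flush threshold
        }
        if (flushedBefore && nowMillis - lastFlushMillis < minIntervalMillis) {
            return false; // point 2: the last flush was too recent
        }
        flushedBefore = true;
        lastFlushMillis = nowMillis;
        return true;
    }
}
```

In a running JVM the two heap figures could come from java.lang.management.MemoryMXBean.getHeapMemoryUsage(). Suitable values for the threshold and the interval depend on the collector settings and would need tuning.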

      If the flush happens before a full GC, the full GC can release the memory occupied by the flushed memtables, which reduces heap usage considerably. Otherwise, full GCs run again and again until a flush finishes.

      Attachments

        1. dse_systemlog
          9 kB
          Rekha Joshi

          People

            Assignee: Unassigned
            Reporter: Billow Gao (billowgao)