Apache Cassandra / CASSANDRA-1967

commit log replay shouldn't end with a flush


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Low
    • Resolution: Not A Problem
    • None
    • None
    • None

    Description

      (Apologies in advance if there is some very compelling reason to flush after replay, of which I am not currently aware. ;D)

      Currently, when a node restarts, the following sequence occurs:

      a) commitlog is replayed
      b) any memtables resulting from a) are flushed
      c) a new commitlog is opened, new memtables are switched in
      ... (other stuff happens)
      d) node starts taking traffic

      This has side effects, the most serious being that the flush can trigger compaction. A node is likely to struggle performance-wise right after restarting, so triggering compaction at that moment seems like something we might wish to avoid.

      I propose that the sequence be:

      a) commitlog is replayed
      b) a new commitlog is opened, new memtables are switched in
      ... (other stuff happens)
      c) node starts taking traffic
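
      To make the difference between the two sequences concrete, here is a minimal toy model in Java. It is only a sketch: StartupSketch, COMPACTION_THRESHOLD and every method name are hypothetical and do not correspond to Cassandra classes, and the rule "schedule a compaction once the SSTable count reaches the threshold" is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of node startup; none of these names are Cassandra classes. */
final class StartupSketch {
    // Hypothetical: number of SSTables at which a compaction is scheduled.
    static final int COMPACTION_THRESHOLD = 4;

    final List<String> events = new ArrayList<>();
    int sstables;             // SSTables already on disk
    boolean memtableHasData;  // true once replay has repopulated a memtable

    StartupSketch(int existingSstables) {
        this.sstables = existingSstables;
    }

    void replayCommitLog() {
        memtableHasData = true;
        events.add("replay");
    }

    void flush() {
        if (!memtableHasData) {
            return; // nothing to write out
        }
        memtableHasData = false;
        sstables++; // each flush adds one SSTable in this model
        events.add("flush");
        if (sstables >= COMPACTION_THRESHOLD) {
            events.add("compaction");
        }
    }

    void openNewCommitLog() {
        events.add("new commitlog");
    }

    void startTakingTraffic() {
        events.add("serve");
    }

    /** Runs one startup; flushAfterReplay = true models the current sequence. */
    static List<String> run(int existingSstables, boolean flushAfterReplay) {
        StartupSketch node = new StartupSketch(existingSstables);
        node.replayCommitLog();
        if (flushAfterReplay) {
            node.flush();
        }
        node.openNewCommitLog();
        node.startTakingTraffic();
        return node.events;
    }

    public static void main(String[] args) {
        System.out.println("current:  " + run(3, true));
        System.out.println("proposed: " + run(3, false));
    }
}
```

      With three SSTables already on disk, the model of the current sequence records a flush and a compaction before the node serves traffic, while the model of the proposed sequence records neither. The replayed data then stays in the memtable until some later flush.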

      Looking through the relevant code, the only code that appears to depend on this flush is at src/java/org/apache/cassandra/db/commitlog/CommitLog.java:112:
      "
      // all old segments are recovered and deleted before CommitLog is instantiated.
      // All we need to do is create a new one.
      segments.add(new CommitLogSegment());
      "

      Presumably this code would have to be refactored to be aware of the currently open commitlog.
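
      One possible reading of that refactoring, sketched in Java under stated assumptions: CommitLogSketch and Segment are hypothetical stand-ins rather than the real CommitLog and CommitLogSegment, and the "dirty" flag is an invented way of marking a replayed segment whose writes have not yet been flushed. The idea is that the constructor keeps such segments instead of assuming they were already deleted.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a commit log; not the real Cassandra CommitLog. */
final class CommitLogSketch {
    /** Hypothetical segment: a name plus whether unflushed writes still depend on it. */
    static final class Segment {
        final String name;
        final boolean dirty;

        Segment(String name, boolean dirty) {
            this.name = name;
            this.dirty = dirty;
        }
    }

    final List<Segment> segments = new ArrayList<>();

    /**
     * Without a flush after replay, replayed segments may still back
     * unflushed memtable data, so dirty ones are retained.
     */
    CommitLogSketch(List<Segment> replayed) {
        for (Segment s : replayed) {
            if (s.dirty) {
                segments.add(s);
            }
        }
        // Fresh segment for writes that arrive after startup.
        segments.add(new Segment("new", false));
    }

    List<String> segmentNames() {
        List<String> names = new ArrayList<>();
        for (Segment s : segments) {
            names.add(s.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Segment> replayed = new ArrayList<>();
        replayed.add(new Segment("old-1", false));
        replayed.add(new Segment("old-2", true));
        System.out.println(new CommitLogSketch(replayed).segmentNames());
    }
}
```

      In this sketch a retained segment could only be discarded after the memtable data it covers has been flushed, which is the bookkeeping the existing comment avoids by flushing up front.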

          People

            Assignee: Unassigned
            Reporter: Robert Coli (rcoli)