Cassandra / CASSANDRA-10593

Unintended interactions between commitlog archiving and commitlog recycling

Details

    • Normal

    Description

      Currently the comments in commitlog_archiving.properties suggest using either cp or ln for the archive_command.
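
For illustration, the two documented alternatives would look roughly like the fragment below. The /backup directory is a placeholder chosen for this example; %path and %name are the substitution tokens that commitlog_archiving.properties describes for archive_command.

```properties
# commitlog_archiving.properties (illustrative fragment; /backup is a placeholder)

# Option 1: copy each segment into the archive directory
archive_command=/bin/cp %path /backup/%name

# Option 2: hard-link each segment into the archive directory
# archive_command=/bin/ln %path /backup/%name
```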

      Using ln is problematic because commitlog recycling marks segments as recycled once the corresponding memtables are flushed, and Cassandra will no longer replay them. A hard link created by ln is the same file as the live segment, so the archived copy is affected as well. This means PITR is only possible for records written since the last flush.
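
The file-system behavior behind this can be reproduced in isolation. The following is a minimal Python sketch, not Cassandra code: the file names and contents are made up, and the in-place rewrite is only a stand-in for whatever Cassandra does to a segment after flush. It shows that a hard link sees later changes to the original file, while a copy does not.

```python
import os
import shutil
import tempfile


def archive_and_rewrite(workdir):
    """Archive a fake segment via hard link and via copy, then rewrite the original in place."""
    segment = os.path.join(workdir, "segment.log")    # hypothetical live segment
    linked = os.path.join(workdir, "archive-ln.log")  # archive made the way `ln` would
    copied = os.path.join(workdir, "archive-cp.log")  # archive made the way `cp` would

    with open(segment, "wb") as f:
        f.write(b"original mutations")

    os.link(segment, linked)          # second name for the same inode
    shutil.copyfile(segment, copied)  # independent file with its own data

    # Overwrite the start of the original file without replacing the inode.
    with open(segment, "r+b") as f:
        f.write(b"REWRITTEN")

    with open(linked, "rb") as f:
        linked_bytes = f.read()
    with open(copied, "rb") as f:
        copied_bytes = f.read()
    return linked_bytes, copied_bytes


with tempfile.TemporaryDirectory() as d:
    linked_bytes, copied_bytes = archive_and_rewrite(d)
    print("hard link:", linked_bytes)
    print("copy:     ", copied_bytes)
```

The hard-linked archive reflects the rewrite, while the copied archive still holds the bytes that were present at archive time.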

      Using cp works, and this is currently how OpsCenter does PITR. However, brandon.williams has pointed out that this could have some performance impact because of the additional I/O overhead of copying the commitlog segments.

      Starting in 2.1, we can disable commit log recycling in cassandra.yaml so I thought this would allow me to do PITR without the extra overhead of using cp. However, when I disable commitlog recycling and try to do a PITR, Cassandra blows up when trying to replay the restored commit logs:

      ERROR 16:56:42  Exception encountered during startup
      java.lang.IllegalStateException: Cannot safely construct descriptor for segment, as name and header descriptors do not match ((4,1445878452545) vs (4,1445876822565)): /opt/dse/backup/CommitLog-4-1445876822565.log
      	at org.apache.cassandra.db.commitlog.CommitLogArchiver.maybeRestoreArchive(CommitLogArchiver.java:207) ~[cassandra-all-2.1.9.791.jar:2.1.9.791]
      	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:116) ~[cassandra-all-2.1.9.791.jar:2.1.9.791]
      	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:352) ~[cassandra-all-2.1.9.791.jar:2.1.9.791]
      	at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:335) ~[dse-core-4.8.0.jar:4.8.0]
      	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:537) ~[cassandra-all-2.1.9.791.jar:2.1.9.791]
      	at com.datastax.bdp.DseModule.main(DseModule.java:75) [dse-core-4.8.0.jar:4.8.0]
      java.lang.IllegalStateException: Cannot safely construct descriptor for segment, as name and header descriptors do not match ((4,1445878452545) vs (4,1445876822565)): /opt/dse/backup/CommitLog-4-1445876822565.log
      	at org.apache.cassandra.db.commitlog.CommitLogArchiver.maybeRestoreArchive(CommitLogArchiver.java:207)
      	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:116)
      	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:352)
      	at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:335)
      	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:537)
      	at com.datastax.bdp.DseModule.main(DseModule.java:75)
      Exception encountered during startup: Cannot safely construct descriptor for segment, as name and header descriptors do not match ((4,1445878452545) vs (4,1445876822565)): /opt/dse/backup/CommitLog-4-1445876822565.log
      INFO  16:56:42  DSE shutting down...
      INFO  16:56:42  All plugins are stopped.
      ERROR 16:56:42  Exception in thread Thread[Thread-2,5,main]
      java.lang.AssertionError: null
      	at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1403) ~[cassandra-all-2.1.9.791.jar:2.1.9.791]
      	at com.datastax.bdp.gms.DseState.setActiveStatus(DseState.java:196) ~[dse-core-4.8.0.jar:4.8.0]
      	at com.datastax.bdp.server.DseDaemon.preStop(DseDaemon.java:426) ~[dse-core-4.8.0.jar:4.8.0]
      	at com.datastax.bdp.server.DseDaemon.safeStop(DseDaemon.java:436) ~[dse-core-4.8.0.jar:4.8.0]
      	at com.datastax.bdp.server.DseDaemon$1.run(DseDaemon.java:676) ~[dse-core-4.8.0.jar:4.8.0]
      	at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_31]
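
The mismatch reported in the exception can be checked by hand. The sketch below is a hypothetical helper, not part of Cassandra: it parses the version and segment ID out of a file name of the form CommitLog-<version>-<id>.log and compares them with a header descriptor supplied by the caller. It assumes the pair in the message that does not match the file name, (4,1445878452545), is the one read from the header.

```python
import re

# File names like the one in the error: CommitLog-<version>-<id>.log
SEGMENT_NAME = re.compile(r"CommitLog-(\d+)-(\d+)\.log$")


def descriptor_from_name(path):
    """Return (version, id) parsed from a segment file name, or None if it does not match."""
    m = SEGMENT_NAME.search(path)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


def descriptors_match(path, header_descriptor):
    """Compare the name-derived descriptor with a header descriptor given by the caller."""
    return descriptor_from_name(path) == header_descriptor


path = "/opt/dse/backup/CommitLog-4-1445876822565.log"
header = (4, 1445878452545)  # assumed to be the header value from the exception text

print(descriptor_from_name(path), header, descriptors_match(path, header))
```

Under that assumption, the archived file kept its original name while its header carries a different segment ID.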
      

      For the sake of completeness, I also tested with cp as the archive_command and commitlog recycling disabled. PITR works as expected in that configuration, but this of course defeats the point.

      It would be good to have some guidance on what is supported here. If ln isn't expected to work at all, it shouldn't be documented as an acceptable option for the archive_command in commitlog_archiving.properties. If it should work with commitlog recycling disabled, the bug causing the IllegalStateException needs to be fixed.

      It would also be good to do some testing and quantify the performance impact of enabling commitlog archiving using cp as the archive_command.

      I realize there are several different issues described here, so maybe they should be separate JIRAs, but first I wanted to just clarify whether we want to support ln at all, and we can go from there.

      Attachments

        1. system.log
          120 kB
          Ariel Weisberg
        2. commitlog_archiving.properties
          2 kB
          Ariel Weisberg
        3. cassandra.yaml
          37 kB
          Ariel Weisberg

            People

              aweisberg Ariel Weisberg
              jblangston@datastax.com J.B. Langston
              Ariel Weisberg
              Branimir Lambov
              Votes: 1
              Watchers: 9
