  1. Apache Cassandra
  2. CASSANDRA-4456

AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair

Details

    • Bug
    • Status: Resolved
    • Normal
    • Resolution: Fixed
    • 1.1.3
    • None
    • None
    • Ubuntu 11.04 64-bit

    • Normal

    Description

      We have hit the following exception on several nodes while running repairs across our 1.1.2 ring. We've observed it happen on either the node executing the repair or a participating replica in the repair operation. The result in either case is that the repair hangs.

      ERROR [ValidationExecutor:9] 2012-07-21 01:54:03,019 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[ValidationExecutor:9,1,main]
      java.lang.AssertionError
      at org.apache.cassandra.db.ColumnFamilyStore.getOverlappingSSTables(ColumnFamilyStore.java:874)
      at org.apache.cassandra.db.compaction.CompactionController.<init>(CompactionController.java:69)
      at org.apache.cassandra.db.compaction.CompactionManager$ValidationCompactionIterable.<init>(CompactionManager.java:834)
      at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:698)
      at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:68)
      at org.apache.cassandra.db.compaction.CompactionManager$8.call(CompactionManager.java:438)
      at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
      at java.lang.Thread.run(Thread.java:662)
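
Because the repair hangs rather than failing outright, it can help to scan each node's log for this specific trace. The following is a minimal sketch, assuming the log line layout matches the excerpt above; the function name `find_validation_assertions` is hypothetical and not part of Cassandra.

```python
import re
from typing import Iterable, List, Tuple

# Matches the first line of an ERROR entry logged by a ValidationExecutor thread,
# e.g. "ERROR [ValidationExecutor:9] 2012-07-21 01:54:03,019 ...".
ERROR_LINE = re.compile(
    r"^\s*ERROR \[(ValidationExecutor:\d+)\] "
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
)


def find_validation_assertions(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (thread, timestamp) for each AssertionError raised in getOverlappingSSTables."""
    hits: List[Tuple[str, str]] = []
    current = None          # (thread, timestamp) of the ERROR entry being read
    saw_assertion = False   # True once "java.lang.AssertionError" was seen in that entry
    for line in lines:
        match = ERROR_LINE.match(line)
        if match:
            current = (match.group(1), match.group(2))
            saw_assertion = False
            continue
        if current is None:
            continue
        stripped = line.strip()
        if stripped == "java.lang.AssertionError":
            saw_assertion = True
        elif saw_assertion and "ColumnFamilyStore.getOverlappingSSTables" in stripped:
            hits.append(current)
            current = None
        elif not stripped.startswith("at "):
            # Any line that is not a stack frame ends the current entry.
            current = None
    return hits
```

Feeding the function the lines of a node's system log returns one entry per occurrence, which shows which nodes took part in a repair that is likely stuck.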

      In building this ring we migrated sstables from an identical 0.8.8 ring by:

      1. Creating the schema on our new 1.1.2 ring.
      2. Rsyncing over sstables from the 0.8.8 ring.
      3. Renaming the sstables to match the directory and file naming structure of 1.1.x.
      4. Running nodetool refresh <keyspace> <cf> for each CF on each node.
      5. Running nodetool upgradesstables for each CF on each node.
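
Steps 4 and 5 can be sketched as a small command generator. This is an illustration only: the host names, keyspace and CF names are placeholders, the helper `build_migration_commands` is hypothetical, and it assumes nodetool is addressed with its `-h` host option and that upgradesstables is given the keyspace and CF explicitly.

```python
from typing import Iterable, List, Tuple


def build_migration_commands(
    hosts: Iterable[str],
    column_families: Iterable[Tuple[str, str]],
) -> List[List[str]]:
    """Build nodetool invocations: every refresh first, then every upgradesstables."""
    hosts = list(hosts)
    column_families = list(column_families)
    commands: List[List[str]] = []
    for subcommand in ("refresh", "upgradesstables"):
        for host in hosts:
            for keyspace, cf in column_families:
                commands.append(["nodetool", "-h", host, subcommand, keyspace, cf])
    return commands


if __name__ == "__main__":
    # Placeholder host and (keyspace, CF) pair; substitute real values.
    for cmd in build_migration_commands(["node1.example.com"], [("my_keyspace", "my_cf")]):
        print(" ".join(cmd))
```

The commands are only built and printed here, not executed, so the list can be reviewed before it is passed to something like `subprocess.run`.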

      When those steps had completed, we began rolling repairs. Not all of the repair operations have hit the exception – some of the repairs have completed successfully.

      Attachments

        1. 4456.txt
          2 kB
          Sylvain Lebresne

          People

            slebresne Sylvain Lebresne
            mheffner Mike Heffner
            Sylvain Lebresne
            Jonathan Ellis