Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-13696

Digest mismatch Exception if hints file has UnknownColumnFamily

    XMLWordPrintableJSON

Details

    • Availability - Cluster Crash
    • Critical

    Description

      WARN  [HintsDispatcher:2] 2017-07-16 22:00:32,579 HintsReader.java:235 - Failed to read a hint for /127.0.0.2: a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0 - table with id 3882bbb0-6a71-11e7-9bca-2759083e3964 is unknown in file a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints
      ERROR [HintsDispatcher:2] 2017-07-16 22:00:32,580 HintsDispatchExecutor.java:234 - Failed to dispatch hints file a2b7daf1-a6a4-4dfc-89de-32d12d2d48b0-1500242103097-1.hints: file is corrupted ({})
      org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch exception
          at org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:199) ~[main/:na]
          at org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:164) ~[main/:na]
          at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[main/:na]
          at org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:157) ~[main/:na]
          at org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:139) ~[main/:na]
          at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:123) ~[main/:na]
          at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:95) ~[main/:na]
          at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:268) [main/:na]
          at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:251) [main/:na]
          at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:229) [main/:na]
          at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:208) [main/:na]
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_111]
          at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
          at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) [main/:na]
          at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111]
      Caused by: java.io.IOException: Digest mismatch exception
          at org.apache.cassandra.hints.HintsReader$HintsIterator.computeNextInternal(HintsReader.java:216) ~[main/:na]
          at org.apache.cassandra.hints.HintsReader$HintsIterator.computeNext(HintsReader.java:190) ~[main/:na]
          ... 16 common frames omitted
      

      It causes multiple cassandra nodes stop by default.

      Here is the reproduce steps on a 3 nodes cluster, RF=3:
      1. stop node1
      2. send some data with quorum (or one), it will generate hints file on node2/node3
      3. drop the table
      4. start node1

      node2/node3 will report "corrupted hints file" and stop. The impact is very bad for a large cluster, when it happens, almost all the nodes are down at the same time and we have to remove all the hints files (which contain the dropped table) to bring the node back.

      Attachments

        Activity

          People

            jay.zhuang Jay Zhuang
            jay.zhuang Jay Zhuang
            Jay Zhuang
            Jeff Jirsa
            Votes:
            0 Vote for this issue
            Watchers:
            9 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: