CASSANDRA-13740

Orphan hint file gets created while node is being removed from cluster

Details

    • Low

    Description

      During testing I found a new issue: whenever a node is being removed, a hint file for that node gets written and stays inside the hints directory forever. I debugged the code and found that this is caused by a race condition between HintsWriteExecutor.java::flush and HintsWriteExecutor.java::closeWriter.

      Time t1: The node is down, so hints are being written by HintsWriteExecutor.java::flush.
      Time t2: The node is removed from the cluster, which calls HintsService.java::exciseStore and removes the hint files for the node being removed.
      Time t3: The mutation stage keeps pumping hints through HintsService.java::write, which again calls HintsWriteExecutor.java::flush, and a new orphan file gets created.
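
      The timeline above can be sketched with a toy in-memory model. This is not Cassandra's code: the class OrphanHintRaceDemo and its methods are hypothetical names, and a map entry stands in for one hint file per host. It only illustrates why a flush that arrives after the excise recreates the file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Toy model only: names are hypothetical and do not mirror Cassandra's classes.
class OrphanHintRaceDemo {
    // hostId -> buffered hints; each entry stands in for one hint file on disk.
    private final Map<UUID, List<String>> hintDir = new HashMap<>();

    // Writes a hint, creating the host's "file" if it does not exist yet.
    void flush(UUID hostId, String hint) {
        hintDir.computeIfAbsent(hostId, id -> new ArrayList<>()).add(hint);
    }

    // Deletes the host's "file", as happens when the node is removed.
    void excise(UUID hostId) {
        hintDir.remove(hostId);
    }

    boolean hasHintFile(UUID hostId) {
        return hintDir.containsKey(hostId);
    }

    public static void main(String[] args) {
        OrphanHintRaceDemo demo = new OrphanHintRaceDemo();
        UUID node = UUID.randomUUID();

        demo.flush(node, "hint-1"); // t1: node is down, hints are written
        demo.excise(node);          // t2: node is removed, its files are deleted
        demo.flush(node, "hint-2"); // t3: a late hint recreates the file

        System.out.println("orphan file present: " + demo.hasHintFile(node)); // prints true
    }
}
```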

      I was writing a new dtest for CASSANDRA-13562 and CASSANDRA-13308, and that helped me reproduce this new bug. I will submit a patch for the new dtest later.

      I also tried the following to see how this orphan hint file responds:
      1. I ran nodetool truncatehints <node>, but it fails because the node is no longer part of the ring.
      2. I then ran nodetool truncatehints with no node argument; that still doesn't remove the hint file, because the file is not yet included in the dispatchDequeue.

      Steps to reproduce:
      Please find the attached dtest Python file gossip_hang_test.py, which reproduces this bug.

      Solution:
      This is due to the race condition described above. Since HintsWriteExecutor.java creates a thread pool with only one worker, the solution is fairly simple: whenever we excise a host via HintsService.java::excise, store it in memory, and check for already-excised (evicted) hosts inside HintsWriteExecutor.java::flush. If the host has already been excised, ignore its hints.
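
      A minimal sketch of this idea, assuming a toy in-memory model rather than the real classes (this is not the attached patch). ExciseAwareHintWriter and its methods are hypothetical names. Because both flush and excise run as tasks on the same single-threaded executor, the excised-host check and the file creation cannot interleave with an excise.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Toy model only: names are hypothetical and do not mirror Cassandra's classes.
class ExciseAwareHintWriter {
    // One worker, so tasks run one at a time in submission order.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    // State below is only touched from tasks running on that single worker.
    private final Map<UUID, List<String>> hintFiles = new HashMap<>();
    private final Set<UUID> excisedHosts = new HashSet<>();

    void flush(UUID hostId, String hint) {
        executor.submit(() -> {
            if (excisedHosts.contains(hostId)) {
                return; // host was already excised: ignore the hint
            }
            hintFiles.computeIfAbsent(hostId, id -> new ArrayList<>()).add(hint);
        });
    }

    void excise(UUID hostId) {
        executor.submit(() -> {
            excisedHosts.add(hostId); // remember the host in memory
            hintFiles.remove(hostId); // delete its "hint file"
        });
    }

    // Runs the check on the same worker, so it sees all earlier tasks.
    boolean hasHintFile(UUID hostId) {
        Callable<Boolean> check = () -> hintFiles.containsKey(hostId);
        try {
            return executor.submit(check).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    void shutdown() {
        executor.shutdown();
    }

    public static void main(String[] args) {
        ExciseAwareHintWriter writer = new ExciseAwareHintWriter();
        UUID removed = UUID.randomUUID();

        writer.flush(removed, "hint-1"); // t1
        writer.excise(removed);          // t2
        writer.flush(removed, "hint-2"); // t3: ignored, no orphan file

        System.out.println("orphan file present: " + writer.hasHintFile(removed)); // prints false
        writer.shutdown();
    }
}
```

      One open design question in this sketch is that the set of excised hosts grows for the lifetime of the process; whether that matters depends on how often nodes are removed.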

      Jaydeep

      Attachments

        1. gossip_hang_test.py
          3 kB
          Jaydeepkumar Chovatia
        2. 13740-3.0.15.txt
          10 kB
          Jaydeepkumar Chovatia

            People

              chovatia.jaydeep@gmail.com Jaydeepkumar Chovatia
              chovatia.jaydeep@gmail.com Jaydeepkumar Chovatia
              Jaydeepkumar Chovatia
              Aleksey Yeschenko

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 10m