Cassandra / CASSANDRA-13740

Orphan hint file gets created while node is being removed from cluster

    Details

    • Severity:
      Low

      Description

      I found this issue during testing: whenever a node is being removed, a hint file for that node gets written and then stays in the hints directory forever. I debugged the code and found that this is due to a race condition between HintsWriteExecutor.java::flush and HintsWriteExecutor.java::closeWriter.

      Time t1: The node is down, so hints are being written by HintsWriteExecutor.java::flush.
      Time t2: The node is removed from the cluster, which calls HintsService.java::exciseStore and removes the hint files for the node being removed.
      Time t3: The mutation stage keeps pumping hints through HintsService.java::write, which again calls HintsWriteExecutor.java::flush, and a new orphan file gets created.
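
      The ordering above can be replayed with a toy model. This is only a sketch: ToyHintsDirectory and OrphanHintTimeline are hypothetical names invented for illustration, and the "directory" is just an in-memory set, not Cassandra's real hints code.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Toy model only: these names are hypothetical, not Cassandra's real classes.
class ToyHintsDirectory {
    // Host IDs that currently have a hint file "on disk".
    private final Set<UUID> hostsWithHintFile = new HashSet<>();

    // Stand-in for a flush: creates the host's hint file if it is missing.
    void flush(UUID hostId) {
        hostsWithHintFile.add(hostId);
    }

    // Stand-in for excising a host: deletes its hint file.
    void excise(UUID hostId) {
        hostsWithHintFile.remove(hostId);
    }

    boolean hasHintFile(UUID hostId) {
        return hostsWithHintFile.contains(hostId);
    }
}

class OrphanHintTimeline {
    // Replays t1..t3 and reports whether a hint file exists at the end.
    static boolean replay() {
        ToyHintsDirectory dir = new ToyHintsDirectory();
        UUID removedNode = UUID.randomUUID();

        dir.flush(removedNode);   // t1: node is down, hints are flushed
        dir.excise(removedNode);  // t2: node is removed, its hint file is deleted
        dir.flush(removedNode);   // t3: a late flush recreates the file

        return dir.hasHintFile(removedNode);
    }
}
```

      In this model nothing remembers that the host was excised at t2, so the flush at t3 leaves a file behind that no later step cleans up.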

      I was writing a new dtest for CASSANDRA-13562 and CASSANDRA-13308, and that helped me reproduce this new bug. I will submit a patch for this new dtest later.

      I also tried the following to see how this orphan hint file responds:
      1. I tried nodetool truncatehints <node>, but it fails because the node is no longer part of the ring.
      2. I then tried nodetool truncatehints, but that still doesn't remove the hint file because it is not yet included in the dispatchDequeue.

      Steps to reproduce:
      Please find the attached dtest Python file gossip_hang_test.py, which reproduces this bug.

      Solution:
      This is due to the race condition described above. Since HintsWriteExecutor.java creates a thread pool with only one worker, the solution is fairly simple. Whenever we excise a host via HintsService.java::excise, we store it in memory and check for already excised hosts inside HintsWriteExecutor.java::flush. If the host has already been excised, its hints are ignored.
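
      A minimal sketch of that idea follows. It is not the code from the attached patch: GuardedHintsWriter, GuardedHintsReplay and their methods are hypothetical names, and the hint files are modeled as an in-memory set. The only assumption carried over from the description is the single-worker executor, which means flush and excise tasks run one at a time.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch; not the code from the attached patch.
class GuardedHintsWriter {
    // One worker thread, so flush and excise tasks never run concurrently.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    // Both sets are only touched from tasks running on the worker thread.
    private final Set<UUID> hostsWithHintFile = new HashSet<>();
    private final Set<UUID> excisedHosts = new HashSet<>();

    // Writes hints for the host unless it has already been excised.
    void flush(UUID hostId) throws Exception {
        Callable<Void> task = () -> {
            if (!excisedHosts.contains(hostId)) {
                hostsWithHintFile.add(hostId);
            }
            return null;
        };
        worker.submit(task).get();
    }

    // Remembers the host as excised, then deletes its hint file.
    void excise(UUID hostId) throws Exception {
        Callable<Void> task = () -> {
            excisedHosts.add(hostId);
            hostsWithHintFile.remove(hostId);
            return null;
        };
        worker.submit(task).get();
    }

    boolean hasHintFile(UUID hostId) throws Exception {
        Callable<Boolean> task = () -> hostsWithHintFile.contains(hostId);
        return worker.submit(task).get();
    }

    void shutdown() {
        worker.shutdown();
    }
}

class GuardedHintsReplay {
    // Replays flush, excise, flush and reports whether a file survives.
    static boolean replay() throws Exception {
        GuardedHintsWriter writer = new GuardedHintsWriter();
        try {
            UUID removedNode = UUID.randomUUID();
            writer.flush(removedNode);   // node is down, hints are flushed
            writer.excise(removedNode);  // node is removed from the cluster
            writer.flush(removedNode);   // late flush is now ignored
            return writer.hasHintFile(removedNode);
        } finally {
            writer.shutdown();
        }
    }
}
```

      Because every task goes through the same single worker, the check inside flush needs no extra locking in this sketch. The set of excised hosts only grows here; a real change would need to decide when an entry can be dropped.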

      Jaydeep

        Attachments

        1. 13740-3.0.15.txt
          10 kB
          Jaydeepkumar Chovatia
        2. gossip_hang_test.py
          3 kB
          Jaydeepkumar Chovatia

              People

              • Assignee:
                chovatia.jaydeep@gmail.com Jaydeepkumar Chovatia
                Reporter:
                chovatia.jaydeep@gmail.com Jaydeepkumar Chovatia
                Authors:
                Jaydeepkumar Chovatia
                Reviewers:
                Aleksey Yeschenko
              • Votes:
                0
                Watchers:
                4
