  2. CASSANDRA-19336

Repair causes out of memory


Details

    Description

      CASSANDRA-14096 introduced repair_session_space as a limit on the memory used for Merkle tree calculations during repairs. The limit applies to the set of Merkle trees built for a received validation request (VALIDATION_REQ), and it is divided by the replication factor (RF) so as not to overwhelm the repair coordinator, which will have requested RF sets of Merkle trees. That way, the repair coordinator should use at most repair_session_space for the RF sets of Merkle trees.

      However, a repair session without -pr/--partitioner-range will send RF*RF validation requests, because the repair coordinator node has RF-1 replicas and is also a replica for RF-1 other nodes. Since all the requests are sent at the same time, the repair coordinator can at some point hold up to RF*repair_session_space worth of Merkle trees if none of the validation responses is fully processed before the last response arrives.
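The arithmetic behind that bound can be sketched as follows. This is only an illustration of the reasoning in the description, not code from Cassandra: the function names are hypothetical, and the result is expressed as a multiple of repair_session_space under the stated worst-case assumption that no response is consumed before the last one arrives.

```python
from fractions import Fraction


def per_request_share(rf: int) -> Fraction:
    """Share of repair_session_space allowed for one validation request."""
    return Fraction(1, rf)


def worst_case_multiple(rf: int, num_requests: int) -> Fraction:
    """Upper bound on the Merkle trees held by the coordinator, in units of
    repair_session_space, assuming every response is still in memory."""
    return num_requests * per_request_share(rf)


rf = 3
with_pr = worst_case_multiple(rf, rf)          # RF requests, the intended case
without_pr = worst_case_multiple(rf, rf * rf)  # RF*RF requests, no -pr
print(with_pr, without_pr)  # 1 3
```

With RF=3, the intended case stays at one repair_session_space, while the RF*RF case reaches three times that amount.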

      Even worse, if the cluster uses virtual nodes, many nodes can be replicas of the repair coordinator, and some nodes can be replicas for multiple token ranges. This means that the repair coordinator can send more than RF or RF*RF simultaneous validation requests.

      For example, in an 11-node cluster with RF=3 and 256 tokens, we have seen a repair session involving 44 groups of ranges to be repaired. This produces 44*3=132 validation requests contacting all the nodes in the cluster. When the responses to all these requests start to arrive at the coordinator, each containing up to repair_session_space/3 worth of Merkle trees, they accumulate faster than they are consumed, greatly exceeding repair_session_space and OOMing the node.
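To put rough numbers on this example, the sketch below computes an upper bound on the Merkle tree memory held by the coordinator if none of the 132 responses were consumed before the last one arrived. The 384 MiB value for repair_session_space is a hypothetical setting chosen only because it divides evenly by 3, and the function name is illustrative rather than part of Cassandra.

```python
MIB = 1024 * 1024


def peak_merkle_tree_bytes(range_groups: int, rf: int, repair_session_space: int) -> int:
    """Upper bound on bytes of Merkle trees held at once by the coordinator,
    assuming every response is full-sized and none has been consumed yet."""
    requests = range_groups * rf              # 44 * 3 = 132 in the example
    per_response = repair_session_space // rf  # each response: up to space / RF
    return requests * per_response


space = 384 * MIB  # hypothetical repair_session_space, not a recommended value
peak = peak_merkle_tree_bytes(range_groups=44, rf=3, repair_session_space=space)
print(peak // MIB)  # 16896
```

Under these assumptions the bound is 44 times repair_session_space (16896 MiB here). Actual usage depends on how quickly responses are consumed and how large each set of trees really is.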

            People

              adelapena Andres de la Peña
              David Capwell

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 40m