  1. Cassandra
  2. CASSANDRA-5419

Employ column differencing (as done for read repairs) during node repairs

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Normal
    • Resolution: Duplicate
    • Fix Version/s: None
    • Component/s: None
    • Labels:
    • Environment:

      Production

      Description

      Node repairs can require substantial headroom, particularly for wide rows, because an entire row is streamed whenever a row hash discrepancy is detected.

      This headroom must be maintained until compaction works through the newly streamed SSTables and reduces the overall load on each instance.

      The overall footprint of node repairs would be greatly reduced if we employed differencing at the column level and sent only row mutations, similar to what is done during read repair. This would be a good alternative for deployments where sending over entire rows, rather than just the deltas, is not an option.
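
      To illustrate the idea, here is a minimal sketch of column-level differencing between two replicas' copies of one row. It is not Cassandra's implementation: the `Row` model and the `column_delta` helper are hypothetical, and the sketch assumes simple last-write-wins by timestamp while ignoring tombstones and timestamp ties.

```python
from typing import Dict, Tuple

# Hypothetical row model: {column_name: (value, write_timestamp)}.
Row = Dict[str, Tuple[bytes, int]]


def column_delta(source: Row, target: Row) -> Row:
    """Return the columns of `source` that `target` lacks or holds with an older timestamp."""
    delta: Row = {}
    for name, (value, ts) in source.items():
        other = target.get(name)
        # Send a column only if the target is missing it or has a stale version.
        if other is None or other[1] < ts:
            delta[name] = (value, ts)
    return delta


# Two replicas disagree on one column, and replica B is missing another.
replica_a: Row = {"c1": (b"x", 10), "c2": (b"y", 20), "c3": (b"z", 30)}
replica_b: Row = {"c1": (b"x", 10), "c2": (b"old", 15)}

to_b = column_delta(replica_a, replica_b)  # mutation to send to replica B
to_a = column_delta(replica_b, replica_a)  # mutation to send to replica A
```

      In this example only `c2` and `c3` would travel to replica B, and nothing would travel to replica A, whereas a row-level repair would stream the whole row.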

      Since node repairs can now specify start and end tokens (i.e. subrange repairs), the additional computation can be broken down easily, and it's a welcome trade-off for significantly less streaming, compaction, and temporary headroom requirements.
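
      One way to break the work down is to split a token range into contiguous subranges and repair each in turn. The sketch below is only an assumption about how an operator script might do this: `split_token_range` is a hypothetical helper, tokens are modeled as plain integers, and ranges that wrap around the ring are not handled.

```python
from typing import List, Tuple


def split_token_range(start: int, end: int, parts: int) -> List[Tuple[int, int]]:
    """Split the range (start, end] into up to `parts` contiguous, non-empty subranges."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    if end <= start:
        raise ValueError("wrapping or empty ranges are not handled in this sketch")
    width = end - start
    # Never create more subranges than there are tokens in the range.
    parts = min(parts, width)
    bounds = [start + (width * i) // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


# Each (start_token, end_token) pair could be passed to one subrange repair.
subranges = split_token_range(0, 100, 4)
```

      Smaller subranges bound how much differencing work and streaming any single repair invocation performs, at the cost of more invocations.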

              People

              • Assignee:
                Unassigned
                Reporter:
                abashir Ahmed Bashir