CASSANDRA-2841: Always use even distribution for merkle tree with RandomPartitioner

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Low
    • Resolution: Fixed
    • Fix Version/s: 0.7.7, 0.8.2
    • Component/s: None
    • Labels:

      Description

      When creating the initial merkle tree, repair tries to be (too) smart and uses the key samples to "guide" the tree splitting. While this is a good idea for OPP, where there is a good chance the data distribution is uneven, you can't beat an even distribution for the RandomPartitioner. A quick experiment even shows that the method used is significantly less efficient than an even distribution for the ranges of the merkle tree (that is, an even distribution gives a much better distribution of the number of keys per range of the tree).

      Thus let's switch to an even distribution for RandomPartitioner. That three-line change alone amounts to a significant improvement in repair's precision.
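
      The effect described above can be illustrated with a small simulation. This is only a sketch, not Cassandra's code and not the experiment mentioned in the description: `TOKEN_SPACE`, `even_boundaries`, `keys_per_range` and `compare` are hypothetical names, and "sample-guided" splitting is approximated by simply using sampled tokens as range boundaries, which is a simplifying assumption. It assumes tokens are uniformly distributed over the token space, as they would be with RandomPartitioner.

```python
import bisect
import random
import statistics

# Assumption: tokens are uniform over [0, 2**127), as with RandomPartitioner.
TOKEN_SPACE = 2 ** 127


def even_boundaries(n_ranges):
    """Boundaries that cut the token space into n_ranges equal-width ranges."""
    return [TOKEN_SPACE * i // n_ranges for i in range(1, n_ranges)]


def keys_per_range(tokens, boundaries):
    """Count how many tokens fall into each range delimited by sorted boundaries."""
    counts = [0] * (len(boundaries) + 1)
    for token in tokens:
        counts[bisect.bisect_right(boundaries, token)] += 1
    return counts


def compare(n_keys=20000, n_ranges=64, seed=42):
    """Return per-range key counts for even splitting and for a sample-guided stand-in."""
    rng = random.Random(seed)
    tokens = [rng.randrange(TOKEN_SPACE) for _ in range(n_keys)]
    # Hypothetical stand-in for sample-guided splitting:
    # pick some existing keys and use their tokens as range boundaries.
    sampled = sorted(rng.sample(tokens, n_ranges - 1))
    even = keys_per_range(tokens, even_boundaries(n_ranges))
    guided = keys_per_range(tokens, sampled)
    return even, guided


if __name__ == "__main__":
    even, guided = compare()
    print("even:   max keys/range =", max(even), "stdev =", statistics.pstdev(even))
    print("guided: max keys/range =", max(guided), "stdev =", statistics.pstdev(guided))
```

      With uniformly distributed tokens, equal-width ranges should each hold close to the same number of keys, while boundaries drawn from samples produce ranges of very uneven width. A smaller spread in keys per range means each leaf of the tree covers fewer keys in the worst case, which is what makes repair more precise.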

        Attachments

        1. 2841.patch
          3 kB
          Sylvain Lebresne

            People

            • Assignee:
              slebresne Sylvain Lebresne
              Reporter:
              slebresne Sylvain Lebresne
              Authors:
              Sylvain Lebresne
              Reviewers:
              Jonathan Ellis