  1. Apache Cassandra
  2. CASSANDRA-13489

Cassandra Repair in 2.2.8

Details

    • Type: Task
    • Status: Resolved
    • Priority: Urgent
    • Resolution: Feedback Received
    • Environment: Red Hat Enterprise Linux, 32 GB RAM

    Description

      I am using a 4-node Cassandra cluster spanning 2 data centers. Running nodetool repair -dcpar takes around 2 hours 30 minutes for a database of only 20 MB. I looked into tuning data streaming between data centers using the URL below:

      https://support.datastax.com/hc/en-us/articles/205409646-How-to-performance-tune-data-streaming-activities-like-repair-and-bootstrap

      But the repair still takes 2 hours 30 minutes. I drilled down into the repair logs and found that Cassandra repairs 256 token ranges per node, which is 4 * 256 = 1024 ranges in total. Within a single token range, a Merkle tree is compared for each column family, and each comparison takes around 110 ms. We have 80 column families, so the total is 110 ms * 80 * 1024 ≈ 9,011 seconds, which is about 2 hours 30 minutes.
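      The estimate above can be checked with a short calculation. This is only a sketch of the arithmetic in this report: the function name and parameters are made up for illustration, and it assumes the comparisons run strictly one after another at a flat 110 ms each, which may not match how repair actually schedules its work.

```python
def estimate_repair_seconds(nodes, ranges_per_node, column_families, ms_per_comparison):
    """Rough repair time, assuming every Merkle tree comparison runs sequentially."""
    total_ranges = nodes * ranges_per_node          # 4 * 256 = 1024
    comparisons = total_ranges * column_families    # one comparison per range per column family
    return comparisons * ms_per_comparison / 1000.0


# Figures taken from this report.
seconds = estimate_repair_seconds(
    nodes=4, ranges_per_node=256, column_families=80, ms_per_comparison=110
)
hours, remainder = divmod(seconds, 3600)
print(f"{seconds:.1f} s = {int(hours)} h {remainder / 60:.0f} min")
```

      Under these assumptions the result is roughly 9,011 seconds, which matches the observed repair time of about 2 hours 30 minutes. That suggests the per-range, per-column-family overhead dominates, not the 20 MB of data.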

      Can we reduce the amount of traffic by generating a Merkle tree for more than one column family at a time?

      Or is there any other way to shorten the repair process in Cassandra 2.2?

          People

            Assignee: Unassigned
            Reporter: ShobanSundar Shoban