  Cassandra / CASSANDRA-13780

Add node streaming throughput performance


Details

    • Type: Improvement
    • Status: Open
    • Priority: Normal
    • Resolution: Unresolved
    • Fix Version/s: 3.0.x
    • Component/s: Legacy/Core
    • Labels: None

    Description

      Problem: Adding a new node to a large cluster runs at least 1000x slower than the network and node hardware can support, taking several days per new node. Adjusting stream throughput and other YAML parameters appears to have no effect on performance. In effect, Cassandra seems to have an architectural scalability problem when adding nodes to a cluster with moderate-to-high data ingestion: new node capacity cannot be added fast enough to keep up with growing ingestion volumes.
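
      For context, these are the streaming throttle settings referred to above; a minimal sketch with illustrative values, not recommendations:

          # cassandra.yaml (3.0.x names; the shipped default is a 200 Mb/s cap per node):
          #   stream_throughput_outbound_megabits_per_sec: 200
          #   inter_dc_stream_throughput_outbound_megabits_per_sec: 200

          # Check the live value on a node:
          nodetool getstreamthroughput

          # Raise the cap at runtime (value in megabits per second; 0 disables throttling):
          nodetool setstreamthroughput 0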

      Initial Configuration:
      Running 3.0.9 and have implemented TWCS on one of our largest tables.
      The largest table is partitioned on (ID, YYYYMM) and uses 1-day TWCS buckets with a TTL of 60 days (a schema sketch follows this list).
      The next release will change the partitioning to (ID, YYYYMMDD) so that partitions align with the daily TWCS buckets.
      Each node currently creates roughly a 30GB SSTable per day.
      TWCS is working as expected: daily SSTables drop off after 70 days (60-day TTL + 10-day grace period).
      Current deployment is a 28-node, 2-datacenter cluster (14 nodes in each DC) with replication factor 3.
      Data directories on each node are backed by 4 x 2TB SSDs, with a single 800GB SSD for commit logs.
      The requirement is to double cluster size, capacity, and ingestion volume within a few weeks.
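
      As a reference point, a hypothetical reconstruction of the table described above; the keyspace, table, and column names are placeholders, and only the partition key shape, the 1-day TWCS windows, the 60-day TTL, and the 10-day grace period come from this report:

          cqlsh -e "
            CREATE TABLE ks.events (
              id       text,
              yyyymm   int,
              event_ts timestamp,
              payload  blob,
              PRIMARY KEY ((id, yyyymm), event_ts)
            ) WITH compaction = {
                'class': 'TimeWindowCompactionStrategy',
                'compaction_window_unit': 'DAYS',
                'compaction_window_size': '1' }
              AND default_time_to_live = 5184000  -- 60 days
              AND gc_grace_seconds = 864000;      -- 10 days
          "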

      Observed Behavior:
      1. Streaming throughput during add node – we observed at most 6 Mb/s of streaming from each of the 14 nodes on a 20Gb/s switched network, so each new node takes at least 106 hours to join the cluster even though each node holds only about 2.2 TB.

      2. Compaction on the newly added node – compaction falls behind, with anywhere from 4,000 to 10,000 SSTables pending at any given time; it took 3 weeks for compaction to finish on each newly added node. Increasing the number of compaction threads to match the CPU count (40) and raising compaction throughput to 32 MB/s seemed to be the sweet spot (settings sketched after this list).

      3. TWCS buckets on the new node – data streamed to this node over 4 1/2 days. Compaction correctly placed the data into daily files, but the file dates reflect when compaction created each file rather than the timestamp of the newest record in the TWCS bucket, which will cause those files to remain around much longer than necessary.
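
      For reference, the compaction settings mentioned in item 2, as they would be applied; the numbers are the ones reported above, not general recommendations:

          # cassandra.yaml on the joining node:
          #   concurrent_compactors: 40            # match the CPU count, per item 2
          #   compaction_throughput_mb_per_sec: 32

          # Compaction throughput can also be changed at runtime (MB/s):
          nodetool setcompactionthroughput 32

          # Watch the pending-compaction backlog while the node catches up:
          nodetool compactionstats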

      Two Questions:
      1. What can be done to substantially improve the performance of adding a new node?
      2. Can compaction on newly added nodes set the TWCS file creation date to match the newest record timestamp in the file, or add another piece of metadata to the TWCS files recording the intended drop date, so that TWCS SSTables are dropped consistently? (A sketch for inspecting the relevant timestamps follows.)
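
      To illustrate the gap question 2 is about, the newest write in an SSTable can be compared with the file's age on disk using the bundled sstablemetadata tool (shipped in tools/bin of the Cassandra distribution); the data path below is an example, not taken from this cluster:

          # Print the min/max write timestamps recorded in the SSTable metadata:
          sstablemetadata /var/lib/cassandra/data/ks/events-*/*-big-Data.db | grep -i timestamp

          # Compare with the filesystem's idea of the file age:
          ls -l /var/lib/cassandra/data/ks/events-*/*-big-Data.db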


          People

            Assignee: Unassigned
            Reporter: Kevin Rivait (krivai442)
            Votes: 0
            Watchers: 8
