Details

    Description

      Setup:

      • Cassandra: 6 nodes (2*3 rack) on i3.8xlarge AWS instances (32 CPU cores, 240GB RAM each) running Cassandra trunk with Jason's 14503 changes, compared against the same footprint running 3.0.17
      • One datacenter, single tokens
      • No compression, encryption, or coalescing turned on

      Test #1:

      ndbench loaded ~150GB of data per node into an LCS table. We then killed a node and let a new node stream. With single tokens this should be a worst-case recovery scenario (only a few peers to stream from).

      Result:

      As the table used LCS and we did not have encryption on, the zero-copy transfer was used via CASSANDRA-14556. We recovered 150GB in 5 minutes, streaming at a consistent rate of about 3 gigabits per second. Theoretically we should be able to reach 10 gigabits per second, but this is still an estimated 16x improvement over 3.0.x. We're still running the 3.0.x test for a hard comparison.
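
      As a rough sanity check on the figures above, the average rate implied by the round numbers (150GB in 5 minutes) can be computed directly. This is only a sketch: it assumes decimal gigabytes (10^9 bytes) and treats both inputs as exact, whereas the data size and duration quoted here are approximate. The function name is ours and is not part of any benchmarking tool.

```python
def average_rate_gbit_per_s(size_gb: float, minutes: float) -> float:
    """Average transfer rate in gigabits per second.

    Assumes decimal units: 1 GB = 10^9 bytes, 1 gigabit = 10^9 bits.
    """
    bits = size_gb * 1e9 * 8
    seconds = minutes * 60
    return bits / seconds / 1e9


# Round numbers from the result above (both are approximate in the ticket).
implied = average_rate_gbit_per_s(150, 5)

# Fraction of the theoretical 10 gigabit per second ceiling mentioned above.
link_fraction = implied / 10

print(f"implied average rate: {implied:.1f} Gbit/s")
print(f"fraction of 10 Gbit/s: {link_fraction:.0%}")
```

      With these inputs the implied average is about 4 gigabits per second, which is in the same range as the roughly 3 gigabits per second observed, given that "150GB" and "5 minutes" are rounded.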

      Follow Ups:

      We need more rigorous measurements (over more node terminations), and we need to finish the 3.0.x test. sumanth.pasupuleti and djoshi3 are driving this.

      Attachments

        1. image-2018-11-06-13-34-33-108.png
          13 kB
          Joey Lynch
        2. 3.0.17-4.0.x-Streaming.png
          13 kB
          Joey Lynch
        3. streaming_benchmarking.patch
          7 kB
          Sumanth Pasupuleti
        4. cassandra_streaming.png
          14 kB
          Sumanth Pasupuleti

          People

            sumanth.pasupuleti Sumanth Pasupuleti
            jolynch Joey Lynch
            Sumanth Pasupuleti
