Cassandra / CASSANDRA-5582

Replace CustomHsHaServer with better optimized solution based on LMAX Disruptor

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 2.0 beta 1
    • Component/s: API, Core
    • Labels:
      None

      Description

      I have been working on https://github.com/xedin/disruptor_thrift_server and consider it stable and performant enough for integration with Cassandra.

      The proposed replacement can work in both on- and off-heap modes (depending on whether JNA is available) and doesn't blindly reallocate things, which allows us to resolve CASSANDRA-4265 as "Won't Fix".
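
      As a rough illustration of the on/off-heap behavior described above, here is a hypothetical sketch (not the actual disruptor_thrift_server code): frame buffers go off heap only when JNA can be loaded, falling back to on-heap buffers otherwise.

          // Hypothetical sketch of the on/off-heap buffer selection described above;
          // the real implementation lives in disruptor_thrift_server and differs in detail.
          import java.nio.ByteBuffer;

          final class FrameBufferAllocator
          {
              private static final boolean JNA_AVAILABLE = detectJna();

              private static boolean detectJna()
              {
                  try
                  {
                      Class.forName("com.sun.jna.Native");
                      return true;
                  }
                  catch (ClassNotFoundException e)
                  {
                      return false; // fall back to on-heap buffers
                  }
              }

              // Off-heap (direct) buffers reduce GC pressure and avoid extra copies;
              // on-heap buffers are the safe default when JNA is missing.
              static ByteBuffer allocate(int frameSize)
              {
                  return JNA_AVAILABLE ? ByteBuffer.allocateDirect(frameSize)
                                       : ByteBuffer.allocate(frameSize);
              }
          }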

  Attachments

  1. Pavel's Patch.rtf
        9 kB
        Vijay
      2. CASSANDRA-5530-invoker-fix.patch
        3 kB
        Pavel Yaskevich
      3. disruptor-3.0.1.jar
        65 kB
        Pavel Yaskevich
      4. disruptor-thrift-0.1-SNAPSHOT.jar
        116 kB
        Pavel Yaskevich
      5. CASSANDRA-5582.patch
        11 kB
        Pavel Yaskevich

          Activity

          Pavel Yaskevich added a comment -

          Aleksey Yeschenko and Vijay, can you guys please attach benchmark numbers on your hardware to compare?

          Vijay added a comment - edited

          Hi Pavel and All,

          I should say this patch is awesome!

          Overall I saw a "> 30%" improvement, at least for most cases. The tests were done on a 32-core Intel machine with 128GB RAM and SSD drives.
          With 4 nodes I was able to do 100K reads per second with this patch; before (trunk) it was less than 50K.

          But I have to say it exposes other issues in the StorageProxy layer...
          For example, I was noticing writes failing silently without hints being stored, etc. I did some digging and it was not due to this patch, though.

          Jonathan Ellis added a comment -

          Pavel Yaskevich, what are your thoughts on integrating disruptor at the StorageProxy and MessagingService layer as well? https://issues.apache.org/jira/browse/CASSANDRA-4718?focusedCommentId=13629447&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13629447

          T Jake Luciani added a comment -

          I tried this and was surprised to see that the trunk HsHa was faster than this patch...

          Here are my numbers:

          Trunk HSHA

          jluciani:tools/ (trunk✗) $ ./bin/cassandra-stress -n 4000000                                                                                                                             [21:43:42]
          Created keyspaces. Sleeping 1s for propagation.
          total,interval_op_rate,interval_key_rate,latency,95th,99th,elapsed_time
          104636,10463,10463,2.8,12.8,93.3,10
          264277,15964,15964,2.5,7.4,73.4,20
          400862,13658,13658,2.6,7.4,73.4,30
          551484,15062,15062,2.6,7.2,113.4,40
          656363,10487,10487,2.6,8.3,113.4,51
          787972,13160,13160,2.7,8.7,62.4,61
          943934,15596,15596,2.6,6.9,62.3,71
          1086619,14268,14268,2.6,6.5,62.3,81
          1196122,10950,10950,2.6,6.9,125.7,92
          1335264,13914,13914,2.6,7.1,125.7,102
          1506565,17130,17130,2.5,6.0,113.3,112
          1677018,17045,17045,2.4,5.7,113.3,122
          1846083,16906,16906,2.4,5.6,111.5,132
          2028642,18255,18255,2.3,5.5,111.5,143
          2174955,14631,14631,2.3,5.5,112.7,153
          2276418,10146,10146,2.4,5.7,112.7,163
          2382601,10618,10618,2.4,6.5,112.7,173
          2526749,14414,14414,2.5,6.0,112.7,184
          2655834,12908,12908,2.6,7.0,112.7,194
          2791234,13540,13540,2.6,6.9,212.5,204
          2980432,18919,18919,2.5,5.9,213.0,214
          3152839,17240,17240,2.5,5.8,213.0,224
          3322844,17000,17000,2.5,5.8,215.2,235
          3505263,18241,18241,2.5,5.5,215.2,245
          3684394,17913,17913,2.3,5.3,215.2,255
          3863978,17958,17958,2.2,5.1,215.2,265
          4000000,13602,13602,2.0,5.1,202.3,275
          
          
          Averages from the middle 80% of values:
          interval_op_rate          : 14731
          interval_key_rate         : 14731
          latency median            : 2.5
          latency 95th percentile   : 6.5
          latency 99.9th percentile : 128.2
          Total operation time      : 00:04:35
          END
          jluciani:tools/ (trunk✗) $ ./bin/cassandra-stress -n 4000000 -o read                                                                                                                     [21:49:14]
          total,interval_op_rate,interval_key_rate,latency,95th,99th,elapsed_time
          105128,10512,10512,3.4,12.6,45.2,10
          268987,16385,16385,3.0,5.8,32.8,20
          443397,17441,17441,2.9,5.0,32.8,30
          626750,18335,18335,2.7,4.6,32.8,40
          812642,18589,18589,2.6,4.4,32.8,50
          1002012,18937,18937,2.6,4.2,32.8,61
          1194865,19285,19285,2.5,4.1,32.8,71
          1385800,19093,19093,2.5,4.1,36.0,81
          1579409,19360,19360,2.5,4.0,33.2,91
          1777652,19824,19824,2.5,3.9,33.2,101
          1975122,19747,19747,2.4,3.8,33.2,112
          2174320,19919,19919,2.4,3.8,33.2,122
          2356130,18181,18181,2.4,3.8,33.2,132
          2553115,19698,19698,2.4,3.8,33.2,142
          2750741,19762,19762,2.4,3.8,33.5,152
          2951876,20113,20113,2.4,3.8,33.5,163
          3143606,19173,19173,2.4,3.8,33.5,173
          3340625,19701,19701,2.4,3.7,33.5,183
          3541574,20094,20094,2.4,3.7,32.8,193
          3741325,19975,19975,2.3,3.7,32.8,203
          3926265,18494,18494,2.2,3.5,32.8,214
          4000000,7373,7373,2.2,3.5,32.8,219
          
          Averages from the middle 80% of values:
          interval_op_rate          : 19250
          interval_key_rate         : 19250
          latency median            : 2.5
          latency 95th percentile   : 4.0
          latency 99.9th percentile : 33.3
          Total operation time      : 00:03:39
          END
          
          

          This patch

          jluciani:tools/ (trunk✗) $ ./bin/cassandra-stress -n 4000000                                                                                                                             [21:57:09]
          Created keyspaces. Sleeping 1s for propagation.
          total,interval_op_rate,interval_key_rate,latency,95th,99th,elapsed_time
          53262,5326,5326,5.5,25.4,158.3,10
          192313,13905,13905,3.2,13.4,158.0,20
          324275,13196,13196,3.2,9.0,83.0,30
          447315,12304,12304,3.2,7.0,83.0,40
          564204,11688,11688,3.2,7.1,82.9,51
          677304,11310,11310,3.3,7.0,82.9,61
          805181,12787,12787,3.3,7.0,82.9,71
          955200,15001,15001,3.1,6.3,82.6,81
          1074254,11905,11905,3.2,6.3,82.6,91
          1175225,10097,10097,3.2,6.5,113.4,102
          1307670,13244,13244,3.2,6.4,113.4,112
          1456159,14848,14848,3.1,6.2,82.6,122
          1604168,14800,14800,3.1,5.9,142.0,132
          1744811,14064,14064,3.0,5.5,142.0,142
          1916217,17140,17140,2.9,5.4,143.1,153
          2057517,14130,14130,2.9,5.4,143.1,163
          2171001,11348,11348,2.9,5.6,143.1,173
          2269043,9804,9804,3.1,6.4,143.1,183
          2366316,9727,9727,3.2,6.9,143.1,193
          2459820,9350,9350,3.3,7.5,143.1,204
          2550311,9049,9049,3.4,8.5,143.1,214
          2637440,8712,8712,3.5,8.1,141.4,224
          2734691,9725,9725,3.6,8.1,141.4,234
          2861857,12716,12716,3.5,7.6,141.4,245
          3006154,14429,14429,3.3,7.3,143.2,255
          3120329,11417,11417,3.3,7.3,155.9,265
          3256086,13575,13575,3.2,7.0,155.9,275
          3367218,11113,11113,3.3,6.9,155.9,285
          3511295,14407,14407,3.2,6.5,153.5,296
          3644578,13328,13328,3.1,6.5,153.5,306
          3776019,13144,13144,3.0,6.4,156.7,316
          3888436,11241,11241,2.8,6.2,156.7,326
          4000000,11156,11156,2.6,6.1,156.7,335
          
          
          Averages from the middle 80% of values:
          interval_op_rate          : 12257
          interval_key_rate         : 12257
          latency median            : 3.2
          latency 95th percentile   : 6.8
          latency 99.9th percentile : 126.2
          Total operation time      : 00:05:35
          END
          jluciani:tools/ (trunk✗) $ ./bin/cassandra-stress -n 4000000 -o read                                                                                                                     [22:25:10]
          total,interval_op_rate,interval_key_rate,latency,95th,99th,elapsed_time
          64286,6428,6428,4.9,20.9,129.1,10
          180799,11651,11651,4.1,12.8,84.1,20
          303741,12294,12294,3.9,9.6,84.0,30
          434113,13037,13037,3.8,7.4,84.3,40
          564535,13042,13042,3.7,7.3,84.4,51
          701376,13684,13684,3.6,6.8,39.0,61
          831301,12992,12992,3.5,6.2,33.8,71
          958285,12698,12698,3.5,6.4,33.8,81
          1095661,13737,13737,3.4,6.2,33.8,91
          1230110,13444,13444,3.4,6.0,33.8,102
          1366013,13590,13590,3.4,6.2,33.8,112
          1503292,13727,13727,3.3,5.9,33.8,122
          1628990,12569,12569,3.3,5.9,33.9,132
          1768176,13918,13918,3.3,6.0,33.9,142
          1904173,13599,13599,3.3,5.9,33.9,153
          2035891,13171,13171,3.3,6.1,32.0,163
          2178002,14211,14211,3.2,6.0,32.0,173
          2311208,13320,13320,3.2,5.9,32.0,183
          2459054,14784,14784,3.2,5.8,32.0,193
          2603760,14470,14470,3.2,5.6,32.0,204
          2750140,14638,14638,3.1,5.5,31.5,214
          2898530,14839,14839,3.1,5.4,31.5,224
          3046585,14805,14805,3.1,5.4,31.5,234
          3191439,14485,14485,3.1,5.4,47.8,244
          3339831,14839,14839,3.1,5.4,47.9,255
          3487253,14742,14742,3.1,5.4,47.9,265
          3637535,15028,15028,3.1,5.2,47.9,275
          3784912,14737,14737,3.1,5.3,47.9,285
          3934077,14916,14916,3.0,5.2,47.9,295
          4000000,6592,6592,2.9,5.2,47.9,300
          
          
          Averages from the middle 80% of values:
          interval_op_rate          : 13840
          interval_key_rate         : 13840
          latency median            : 3.3
          latency 95th percentile   : 6.0
          latency 99.9th percentile : 39.6
          Total operation time      : 00:05:00
          END
          
          
          Vijay added a comment -

          T Jake Luciani what's the hardware? Do you have JNA?

          Pavel Yaskevich added a comment -

          T Jake Luciani Just for the record, I didn't change the yaml in my patch, so you could be comparing sync-with-patch to hsha.
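
          For anyone reproducing the comparison: the Thrift server implementation is selected by rpc_server_type in cassandra.yaml, so a minimal excerpt (assuming the stock option names, which this patch does not change) would be:

              # cassandra.yaml (excerpt) - selects the Thrift server implementation
              rpc_server_type: hsha    # 'sync' is the thread-per-connection default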

          T Jake Luciani added a comment -

          I'm on a MBP. The log said it was using the THsHaDisruptorServer. Were you guys testing the latest trunk? Based on Pavel's numbers it looks like it was using the old HSHA and not CASSANDRA-5530.

          Pavel Yaskevich added a comment -

          TThreadedSelectorServer could be better at dispatching new connections compared to the standard HsHa, but not at handling reads/writes, since it has a lot of synchronization that the TDisruptor one doesn't; that's why the latter shows better latency/throughput results. I believe Vijay was testing on an actual 4-node cluster; I have been testing on my laptop. Aleksey also did some testing on his old laptop in a disk-bound scenario.

          T Jake Luciani added a comment -

          Vijay, can you confirm you were using the new HSHA in your test? I can also test this on actual hardware.

          Pavel, can I get you to post this server to Thrift as well?

          Aleksey Yeschenko added a comment -

          I tested pre-5530 trunk vs. Pavel's disruptor, and the difference was drastic. But yeah, not TThreadedSelectorServer. Still sure it (the disruptor-based server) is going to be faster, but some new numbers on real hardware won't hurt. I'm going to rerun the numbers as well, just in case.

          Pavel Yaskevich added a comment -

          T Jake Luciani That's the idea for the eventual destiny of my server; I wasn't sure that Thrift would accept it with that dependency right away, so I made it a separate project.

          Regarding the "new HsHa": as I already said, the part that actually handles established connections didn't change a bit - https://github.com/apache/thrift/blob/0.9.x/lib/java/src/org/apache/thrift/server/TThreadedSelectorServer.java - in SelectorThread they still do "processInterestChanges". The only improvement I see (questionable under considerable load) is that the server would be able to "accept" connections faster, but the "dispatch" part (when the channel is actually registered with a selector) still adds a fair amount of contention.
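
          To make the contention point concrete, here is a simplified sketch of the interest-change pattern referenced above (illustrative names only, not Thrift's actual code): worker threads enqueue a change and wake the selector, and that shared queue plus wakeup is where the synchronization cost shows up.

              // Simplified sketch of the "processInterestChanges" pattern; names and
              // structure are illustrative only, not Thrift's implementation.
              import java.io.IOException;
              import java.nio.channels.SelectionKey;
              import java.nio.channels.Selector;
              import java.util.Queue;
              import java.util.concurrent.ConcurrentLinkedQueue;

              final class SelectorLoop
              {
                  private final Selector selector;
                  private final Queue<SelectionKey> interestChanges = new ConcurrentLinkedQueue<>();

                  SelectorLoop(Selector selector)
                  {
                      this.selector = selector;
                  }

                  // Called from invoker/worker threads once a response is ready to write:
                  // the key is not mutated directly from another thread, so it is queued
                  // and the selector is woken up -- this queue/wakeup is the contended part.
                  void requestWriteInterest(SelectionKey key)
                  {
                      interestChanges.add(key);
                      selector.wakeup();
                  }

                  void runOnce() throws IOException
                  {
                      selector.select();
                      for (SelectionKey key; (key = interestChanges.poll()) != null; )
                      {
                          if (key.isValid())
                              key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
                      }
                      // ... iterate selector.selectedKeys() and perform the actual reads/writes ...
                  }
              }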

          Pavel Yaskevich added a comment -

          I could have introduced a degradation with my JMX addition, so I will double-check everything, run the tests on a new machine, and post updates.

          Pavel Yaskevich added a comment -

          Modified my patch to use a custom ExecutorService, which is faster than just a fixed-size one. Also added comments to CASSANDRA-5530 explaining why stress tests show the current HsHa as faster than this implementation.

          Pavel Yaskevich added a comment -

          Jonathan Ellis Regarding your question about StorageProxy and MessagingService: I'm not sure about SP, but I will take a look. In the case of MS, it should be made async (client and server), and the disruptor (as in the case of Thrift) could also work well as a dispatching facility.

          Jonathan Ellis added a comment -

          My understanding was that Disruptor really wants to manage the whole lifecycle, e.g. Thrift -> SP -> MS in one Disruptor. No?

          Vijay added a comment -

          My understanding was that Disruptor really wants to manage the whole lifecycle, e.g. Thrift -> SP -> MS

          I think that will happen after we commit CASSANDRA-5239, since we can have a ListenableFuture which will change the interest in thrift/netty etc.

          What I noticed is that a Disruptor in stages would be a bad idea, since it spins through the buffers and takes a lot of CPU cycles without any big advantage... I am sure Pavel will be able to give more details or differ.

          BTW: I am going to run the test again in a few minutes!
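
          To illustrate the CPU-spinning concern, here is a minimal sketch using the LMAX Disruptor 3.0 DSL (the event and handler classes are hypothetical; only the Disruptor API itself is real): the WaitStrategy chosen is what trades latency against pegging a core per idle consumer, which is what would make chaining many stages this way expensive.

              // Minimal sketch of the wait-strategy trade-off (LMAX Disruptor 3.0 DSL).
              // ThriftEvent and the handler are hypothetical stand-ins.
              import java.util.concurrent.Executors;

              import com.lmax.disruptor.BusySpinWaitStrategy;
              import com.lmax.disruptor.EventFactory;
              import com.lmax.disruptor.EventHandler;
              import com.lmax.disruptor.dsl.Disruptor;
              import com.lmax.disruptor.dsl.ProducerType;

              final class ThriftEvent
              {
                  Object payload;
              }

              public class WaitStrategyDemo
              {
                  public static void main(String[] args)
                  {
                      EventFactory<ThriftEvent> factory = new EventFactory<ThriftEvent>()
                      {
                          public ThriftEvent newInstance() { return new ThriftEvent(); }
                      };

                      // BusySpinWaitStrategy gives the lowest latency but pegs a core per
                      // waiting consumer; a BlockingWaitStrategy (or a yielding one) parks
                      // idle consumers instead, at the cost of some latency.
                      Disruptor<ThriftEvent> disruptor = new Disruptor<ThriftEvent>(
                          factory, 1024, Executors.newCachedThreadPool(),
                          ProducerType.MULTI, new BusySpinWaitStrategy());

                      disruptor.handleEventsWith(new EventHandler<ThriftEvent>()
                      {
                          public void onEvent(ThriftEvent event, long sequence, boolean endOfBatch)
                          {
                              // stage work would go here
                          }
                      });

                      disruptor.start();
                  }
              }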

          Pavel Yaskevich added a comment -

          T Jake Luciani and all, here is a patch to fix the problem with the current trunk implementation of the HsHa server so that it actually makes use of the invoker queue. With that patch, stress gives ~22K reads on my machine, where the proposed disruptor-based solution gives ~42K.
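
          A simplified illustration of the behavioral difference the invoker fix targets (hypothetical names, not the actual Thrift/Cassandra code): without the fix the selector thread runs the invocation itself; with the fix it hands the work to the invoker pool and returns straight to select().

              // Hypothetical sketch of the invoker-queue fix described above.
              import java.util.concurrent.ExecutorService;
              import java.util.concurrent.Executors;

              final class InvokerDispatch
              {
                  private final ExecutorService invoker = Executors.newFixedThreadPool(16);

                  // Broken behavior: the selector thread executes the request inline,
                  // stalling all other connections on that selector while it runs.
                  void invokeInline(Runnable frameInvocation)
                  {
                      frameInvocation.run();
                  }

                  // Fixed behavior: the request is queued to the invoker pool and the
                  // selector thread immediately returns to servicing network I/O.
                  void invokeOnPool(Runnable frameInvocation)
                  {
                      invoker.execute(frameInvocation);
                  }
              }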

          Pavel Yaskevich added a comment -

          Changed the package name from tinkerpop to thinkaurelius.

          Pavel Yaskevich added a comment -

          Changing to "In Progress" while I work on potential improvement.

          Pavel Yaskevich added a comment -

          Updated the patch/server to use a ring buffer per selector and updated disruptor to 3.0.1.

          Pavel Yaskevich added a comment -

          The patch is rebased onto the latest trunk and the server defaults are changed. I'm pretty confident about everything right now; tested again on a 4-node cluster (RF=3) and p999 latencies are close to p95 and stable.

          Pavel Yaskevich added a comment -

          So the numbers (4-node cluster (16 physical CPUs, 128GB RAM), RF=3, 2,000,000 keys, stress -C 150 -S 512 -V); stress was run on a separate machine.

          THsHaDisruptorServer (reads, QUORUM) average run

          517758,51775,51775,0.9,1.5,9.1,10
          1070470,55271,55271,0.8,1.5,3.4,20
          1660087,58961,58961,0.8,1.4,3.4,30
          1942881,28279,28279,0.8,1.3,4.1,40
          2000000,5711,5711,0.8,1.3,8.1,44
          
          
          Averages from the middle 80% of values:
          interval_op_rate          : 55335
          interval_key_rate         : 55335
          latency median            : 0.8
          latency 95th percentile   : 1.4
          latency 99.9th percentile : 5.6
          Total operation time      : 00:00:44
          

          TThreadedSelectorServer (reads, QUORUM) average run with invoker patch applied

          544080,54408,54408,0.7,1.5,26.8,10
          1070755,52667,52667,0.8,1.5,3.3,20
          1656322,58556,58556,0.7,1.5,45.8,30
          1799710,14338,14338,0.7,1.5,45.8,40
          1956230,15652,15652,0.8,1.5,45.8,50
          2000000,4377,4377,0.8,1.4,45.8,54
          
          
          Averages from the middle 80% of values:
          interval_op_rate          : 44992
          interval_key_rate         : 44992
          latency median            : 0.7
          latency 95th percentile   : 1.5
          latency 99.9th percentile : 30.4
          Total operation time      : 00:00:54
          

          Also, it's important to note that TThreadedSelectorServer was throwing a few timeout exceptions (5-10) in 1 of 4 runs.

          T Jake Luciani added a comment -

          This looks really good. How does the off heap protocol fair? We would need the project hosted in Maven somewhere to release it.

          Pavel Yaskevich added a comment -

          Can you clarify what you mean by fair? I think I will be able to host it on the Sonatype OSS Repository or Maven Central.

          Jonathan Ellis added a comment -

          He meant "How does the off heap protocol fare [perform]."

          Pavel Yaskevich added a comment -

          Ah, I'm sorry. The numbers I have posted were produced using off-heap frame buffers; they actually give around 2,000-3,000 more operations per second and more stable p999 latencies.

          Jeremiah Jordan added a comment -

          Does a new ticket need to be opened up for https://issues.apache.org/jira/secure/attachment/12583943/CASSANDRA-5530-invoker-fix.patch?

          Pavel Yaskevich added a comment -

          Jeremiah Jordan I don't think we need a separate ticket for that. I can commit that fix to the trunk right now, but we are going to be replacing TThreadedSelectorServer there anyway, so I don't know if I should even do that.

          T Jake Luciani added a comment -

          So for Maven deps I see http://search.maven.org/#artifactdetails%7Ccom.lmax%7Cdisruptor%7C3.0.1%7Cjar, but can you get your service up somewhere?

          Pavel Yaskevich added a comment -

          Disruptor itself is publicly hosted, but I was wondering: can't we just include my thing in lib/ the same way as jamm?

          Pavel Yaskevich added a comment - edited

          Anyway, I will be adding it to Sonatype OSS.

          Edit: created https://issues.sonatype.org/browse/OSSRH-6297 for that.

          Pavel Yaskevich added a comment -

          Pushed the snapshot/release to the Sonatype repo; the snapshot is already available at https://oss.sonatype.org/content/repositories/snapshots/com/thinkaurelius/thrift/thrift-server/0.1-SNAPSHOT/, and the release is in staging.

          groupId: 'com.thinkaurelius.thrift', artifactId: 'thrift-server', version: '0.1'
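
          (A hypothetical Maven-style rendering of the coordinates above; the actual entry added to Cassandra's build.xml may differ.)

              <!-- Hypothetical Maven-style dependency for the coordinates above -->
              <dependency>
                <groupId>com.thinkaurelius.thrift</groupId>
                <artifactId>thrift-server</artifactId>
                <version>0.1</version>
              </dependency>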

          Aleksey Yeschenko added a comment -

          +1

          Pavel Yaskevich added a comment -

          It is now available at http://search.maven.org/#artifactdetails%7Ccom.thinkaurelius.thrift%7Cthrift-server%7C0.1%7Cjar; I will prepare the changes to build.xml.

          Pavel Yaskevich added a comment -

          Pushed to the trunk, with updated build.xml and 0.1 release jar in lib/, resolving.


            People

            • Assignee:
              Pavel Yaskevich
              Reporter:
              Pavel Yaskevich
              Reviewer:
              Aleksey Yeschenko
            • Votes:
              0
              Watchers:
              14
