Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.2.0
    • Component/s: Performance, SSL
    • Labels:
      None

      Description

      From Ilya Grigorik in TS-2365:

      FWIW, I think you may be interested in this discussion:

      In a nutshell, static record size introduces an inherent tradeoff between latency and throughput – smaller records are good for latency, but hurt server throughput by adding bytes and CPU overhead. It would be great if we could implement a smarter strategy in ATS. The extra benefit is that it's one less knob to tune: the out-of-the-box experience would be better optimized for all ATS users, regardless of mix/type of traffic being proxies.

      1. TS-2503.diff
        7 kB
        Sudheer Vinukonda

        Issue Links

          Activity

          Hide
          sudheerv Sudheer Vinukonda added a comment -

          Changes already committed - Marking this to Resolved.

          Show
          sudheerv Sudheer Vinukonda added a comment - Changes already committed - Marking this to Resolved.
          Hide
          zwoop Leif Hedstrom added a comment -

          Can we close this as Fixed?

          Show
          zwoop Leif Hedstrom added a comment - Can we close this as Fixed?
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit e25f0ac0975e93bd1a3512b0bda9aeba7ce0b7b5 in trafficserver's branch refs/heads/master from Sudheer Vinukonda
          [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=e25f0ac ]

          TS-2503 : fix build failure

          Show
          jira-bot ASF subversion and git services added a comment - Commit e25f0ac0975e93bd1a3512b0bda9aeba7ce0b7b5 in trafficserver's branch refs/heads/master from Sudheer Vinukonda [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=e25f0ac ] TS-2503 : fix build failure
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 4a86672f9b7b38e1aac39c2d352031cefe004fd9 in trafficserver's branch refs/heads/master from Sudheer Vinukonda
          [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=4a86672 ]

          TS-2503 : updated per James Peach's review comments

          Show
          jira-bot ASF subversion and git services added a comment - Commit 4a86672f9b7b38e1aac39c2d352031cefe004fd9 in trafficserver's branch refs/heads/master from Sudheer Vinukonda [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=4a86672 ] TS-2503 : updated per James Peach's review comments
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit dee36e7163373c1d61bc6b97e529269dcde134c5 in trafficserver's branch refs/heads/master from Sudheer Vinukonda
          [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=dee36e7 ]

          TS-2503: Dynamic TLS Record Sizing for better page load latencies.
          The strategy used is essentially to use small TLS records that fit into a
          single TCP segment for the first ~1 MB of data, increase record size to 16 KB
          after that to optimize throughput, and then reset record size back to a single
          segment after ~1 second of inactivity—lather, rinse, repeat.

          Show
          jira-bot ASF subversion and git services added a comment - Commit dee36e7163373c1d61bc6b97e529269dcde134c5 in trafficserver's branch refs/heads/master from Sudheer Vinukonda [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=dee36e7 ] TS-2503 : Dynamic TLS Record Sizing for better page load latencies. The strategy used is essentially to use small TLS records that fit into a single TCP segment for the first ~1 MB of data, increase record size to 16 KB after that to optimize throughput, and then reset record size back to a single segment after ~1 second of inactivity—lather, rinse, repeat.
          Hide
          sudheerv Sudheer Vinukonda added a comment -

          Bryan Call :

          "It was hard to see the do while from the patch. I would then put a conditional around most of the code at the top (SSLConfigParams::ssl_maxrecord == -1) and comment it. I would like to avoid calling ink_get_hrtime_internal if the code is not going to use now."

          I agree - the do/while is not easy to see :=). Sure, will guard the ink_get_hrtime_internal with the config.

          "Comment the l -= (l % SSL_MAX_TLS_RECORD_SIZE); since 2 of us didn't know what you were doing."

          Sure, will try to add some comments about this twisted logic :=)

          "sslTotalBytesSent should be incremented by r not total_wrote. The value would be incremented by too many bytes."

          umm, yes - I think I meant to increment sslTotalBytesSent with total_wrote outside the do/while, but, messed up. I will fix this.

          " indentation is sometimes 4 spaces and sometimes 2 spaces"

          Hmm..will double check and fix.

          Thanks for the review and good suggestions :=)

          Show
          sudheerv Sudheer Vinukonda added a comment - Bryan Call : "It was hard to see the do while from the patch. I would then put a conditional around most of the code at the top (SSLConfigParams::ssl_maxrecord == -1) and comment it. I would like to avoid calling ink_get_hrtime_internal if the code is not going to use now." I agree - the do/while is not easy to see :=). Sure, will guard the ink_get_hrtime_internal with the config. "Comment the l -= (l % SSL_MAX_TLS_RECORD_SIZE); since 2 of us didn't know what you were doing." Sure, will try to add some comments about this twisted logic :=) "sslTotalBytesSent should be incremented by r not total_wrote. The value would be incremented by too many bytes." umm, yes - I think I meant to increment sslTotalBytesSent with total_wrote outside the do/while, but, messed up. I will fix this. " indentation is sometimes 4 spaces and sometimes 2 spaces" Hmm..will double check and fix. Thanks for the review and good suggestions :=)
          Hide
          bcall Bryan Call added a comment - - edited

          1. It was hard to see the do while from the patch. I would then put a conditional around most of the code at the top (SSLConfigParams::ssl_maxrecord == -1) and comment it. I would like to avoid calling ink_get_hrtime_internal if the code is not going to use now.
          2. Comment the l -= (l % SSL_MAX_TLS_RECORD_SIZE); since 2 of us didn't know what you were doing.
          3. sslTotalBytesSent should be incremented by r not total_wrote. The value would be incremented by too many bytes.
          4. indentation is sometimes 4 spaces and sometimes 2 spaces

          Show
          bcall Bryan Call added a comment - - edited 1. It was hard to see the do while from the patch. I would then put a conditional around most of the code at the top (SSLConfigParams::ssl_maxrecord == -1) and comment it. I would like to avoid calling ink_get_hrtime_internal if the code is not going to use now. 2. Comment the l -= (l % SSL_MAX_TLS_RECORD_SIZE); since 2 of us didn't know what you were doing. 3. sslTotalBytesSent should be incremented by r not total_wrote. The value would be incremented by too many bytes. 4. indentation is sometimes 4 spaces and sometimes 2 spaces
          Hide
          sudheerv Sudheer Vinukonda added a comment - - edited

          " If I set a config called SSL_MAX_TLS_RECORD_SIZE I would expect that that's the largest flush I would ever get, not a single byte more and not a multiple of it. So by allowing flushes on multiples you're not honoring what the user would expect"

          Brian Geffon -

          Hmm, let me give an example.

          If the user sets the ssl.max_record =16K, this code would call one SSL_write with 16K bytes followed by one flush.

          If the user sets the ssl.max_record = 33K, this code would call two SSL_writes (one with 32K bytes followed by a second with 1K bytes) flushing each time. Ideally, we should call one SSL_write for 33K to honor the user config, but, to satisfy the record boundary flush, I break it at multiples of 16K (only when the config setting is greater than 16K).

          I do agree that, I am not entirely sure whether openSSL would treat one SSL_write with 32K vs two SSL_writes with 16K each identically, but, I hoped there will be some benefit in CPU by calling one SSL_write vs two SSL_writes.

          If this is too confusing, and you and Bryan Call both prefer to just always trim at 16K boundaries, I am fine with that :=)

          Show
          sudheerv Sudheer Vinukonda added a comment - - edited " If I set a config called SSL_MAX_TLS_RECORD_SIZE I would expect that that's the largest flush I would ever get, not a single byte more and not a multiple of it. So by allowing flushes on multiples you're not honoring what the user would expect" Brian Geffon - Hmm, let me give an example. If the user sets the ssl.max_record =16K, this code would call one SSL_write with 16K bytes followed by one flush. If the user sets the ssl.max_record = 33K, this code would call two SSL_writes (one with 32K bytes followed by a second with 1K bytes) flushing each time. Ideally, we should call one SSL_write for 33K to honor the user config, but, to satisfy the record boundary flush, I break it at multiples of 16K (only when the config setting is greater than 16K). I do agree that, I am not entirely sure whether openSSL would treat one SSL_write with 32K vs two SSL_writes with 16K each identically, but, I hoped there will be some benefit in CPU by calling one SSL_write vs two SSL_writes. If this is too confusing, and you and Bryan Call both prefer to just always trim at 16K boundaries, I am fine with that :=)
          Hide
          briang Brian Geffon added a comment -

          Other than these minor issues I think this patch is nice and concise, great work.

          Show
          briang Brian Geffon added a comment - Other than these minor issues I think this patch is nice and concise, great work.
          Hide
          briang Brian Geffon added a comment - - edited

          Sudheer Vinukonda, but you're not doing what the user would expect in either case. If I set a config called SSL_MAX_TLS_RECORD_SIZE I would expect that that's the largest flush I would ever get, not a single byte more and not a multiple of it. So by allowing flushes on multiples you're not honoring what the user would expect.

          Show
          briang Brian Geffon added a comment - - edited Sudheer Vinukonda , but you're not doing what the user would expect in either case. If I set a config called SSL_MAX_TLS_RECORD_SIZE I would expect that that's the largest flush I would ever get, not a single byte more and not a multiple of it. So by allowing flushes on multiples you're not honoring what the user would expect.
          Hide
          sudheerv Sudheer Vinukonda added a comment -

          "Doesn't make sense. The code as implemented will force the flush to be a multiple of SSL_MAX_TLS_RECORD_SIZE, which shouldn't be the case, otherwise these variable names are incredibly confusing. Were you intending to flush on multiples of SSL_MAX_TLS_RECORD size? "

          Brian Geffon - Thanks! The reason I added this is to honor the user's configuration of ssl_max_record_size (if it is higher than 16K). The requirement is to flush the TLS records at record boundaries which should still work. openSSL should make it N TLS records of 16K each and flush at the end. If I set it to 16K, I would end up calling SSL_write() multiple times (not honoring the user config setting) which would also result in N TLS records but, flushed N times.

          Show
          sudheerv Sudheer Vinukonda added a comment - "Doesn't make sense. The code as implemented will force the flush to be a multiple of SSL_MAX_TLS_RECORD_SIZE, which shouldn't be the case, otherwise these variable names are incredibly confusing. Were you intending to flush on multiples of SSL_MAX_TLS_RECORD size? " Brian Geffon - Thanks! The reason I added this is to honor the user's configuration of ssl_max_record_size (if it is higher than 16K). The requirement is to flush the TLS records at record boundaries which should still work. openSSL should make it N TLS records of 16K each and flush at the end. If I set it to 16K, I would end up calling SSL_write() multiple times (not honoring the user config setting) which would also result in N TLS records but, flushed N times.
          Hide
          sudheerv Sudheer Vinukonda added a comment - - edited

          Bryan Call : Thanks for the review. Some initial responses:

          "I would set ink_hrtime now = 0 at the top and move most of the code that is for dynamic tls record size into the else if. There is no reason to do the work if nothing else is going to use it."

          The idea is to track/count the number of bytes transferred within a given connection, which could call load_buffer_and write multiple times (so, now can not be set to "0" at the top).

          Further, I wanted to avoid calling ink_get_hrtime_internal() and ink_hrtime_to_msec() multiple times in the do/while loop, which is why I've separated out the computation of dynamic_tls_record_size outside of the loop.

          "didn't understand:
          + if (l > SSL_MAX_TLS_RECORD_SIZE)

          { + l -= (l % SSL_MAX_TLS_RECORD_SIZE); + }

          "

          This is to basically hand multiples of SSL_MAX_TLS_RECORD_SIZE (if configured that way) to openSSL via one SSL_write() call, which will automatically partition it based on max TLS record size anyways.

          "In line 100 in the patch you shouldn't have to check for || msec_since_last_write > SSL_DEF_TLS_RECORD_MSEC_THRESHOLD since the size would be set to 0 line 82 in the patch based on the same conditional."

          ah, agree - will remove this redundant check. :=)

          Show
          sudheerv Sudheer Vinukonda added a comment - - edited Bryan Call : Thanks for the review. Some initial responses: "I would set ink_hrtime now = 0 at the top and move most of the code that is for dynamic tls record size into the else if. There is no reason to do the work if nothing else is going to use it." The idea is to track/count the number of bytes transferred within a given connection, which could call load_buffer_and write multiple times (so, now can not be set to "0" at the top). Further, I wanted to avoid calling ink_get_hrtime_internal() and ink_hrtime_to_msec() multiple times in the do/while loop, which is why I've separated out the computation of dynamic_tls_record_size outside of the loop. "didn't understand: + if (l > SSL_MAX_TLS_RECORD_SIZE) { + l -= (l % SSL_MAX_TLS_RECORD_SIZE); + } " This is to basically hand multiples of SSL_MAX_TLS_RECORD_SIZE (if configured that way) to openSSL via one SSL_write() call, which will automatically partition it based on max TLS record size anyways. "In line 100 in the patch you shouldn't have to check for || msec_since_last_write > SSL_DEF_TLS_RECORD_MSEC_THRESHOLD since the size would be set to 0 line 82 in the patch based on the same conditional." ah, agree - will remove this redundant check. :=)
          Hide
          briang Brian Geffon added a comment - - edited
          +      if (l > SSL_MAX_TLS_RECORD_SIZE) {
          +         l -= (l % SSL_MAX_TLS_RECORD_SIZE);
          +      }
          

          Doesn't make sense. The code as implemented will force the flush to be a multiple of SSL_MAX_TLS_RECORD_SIZE, which shouldn't be the case, otherwise these variable names are incredibly confusing. Were you intending to flush on multiples of SSL_MAX_TLS_RECORD size? As Bryan Call suggests you should just do:

           if (l > SSL_MAX_TLS_RECORD_SIZE) l = SSL_MAX_TLS_RECORD_SIZE; 

          Other than that I think this looks good.

          Show
          briang Brian Geffon added a comment - - edited + if (l > SSL_MAX_TLS_RECORD_SIZE) { + l -= (l % SSL_MAX_TLS_RECORD_SIZE); + } Doesn't make sense. The code as implemented will force the flush to be a multiple of SSL_MAX_TLS_RECORD_SIZE, which shouldn't be the case, otherwise these variable names are incredibly confusing. Were you intending to flush on multiples of SSL_MAX_TLS_RECORD size? As Bryan Call suggests you should just do: if (l > SSL_MAX_TLS_RECORD_SIZE) l = SSL_MAX_TLS_RECORD_SIZE; Other than that I think this looks good.
          Hide
          bcall Bryan Call added a comment - - edited

          1. dynamic_tls_record_size can be scoped within the else if for max record == -1
          2. I would set ink_hrtime now = 0 at the top and move most of the code that is for dynamic tls record size into the else if. There is no reason to do the work if nothing else is going to use it.
          3. didn't understand:
          + if (l > SSL_MAX_TLS_RECORD_SIZE)

          { + l -= (l % SSL_MAX_TLS_RECORD_SIZE); + }

          Why not just set l equal to SSL_MAX_TLS_RECORD_SIZE?
          4. In line 100 in the patch you shouldn't have to check for || msec_since_last_write > SSL_DEF_TLS_RECORD_MSEC_THRESHOLD since the size would be set to 0 line 82 in the patch based on the same conditional.

          Show
          bcall Bryan Call added a comment - - edited 1. dynamic_tls_record_size can be scoped within the else if for max record == -1 2. I would set ink_hrtime now = 0 at the top and move most of the code that is for dynamic tls record size into the else if. There is no reason to do the work if nothing else is going to use it. 3. didn't understand: + if (l > SSL_MAX_TLS_RECORD_SIZE) { + l -= (l % SSL_MAX_TLS_RECORD_SIZE); + } Why not just set l equal to SSL_MAX_TLS_RECORD_SIZE? 4. In line 100 in the patch you shouldn't have to check for || msec_since_last_write > SSL_DEF_TLS_RECORD_MSEC_THRESHOLD since the size would be set to 0 line 82 in the patch based on the same conditional.
          Hide
          sudheerv Sudheer Vinukonda added a comment -

          updated the docs for configuring dynamic TLS record sizing via the setting proxy.config.ssl.max_record.

          Show
          sudheerv Sudheer Vinukonda added a comment - updated the docs for configuring dynamic TLS record sizing via the setting proxy.config.ssl.max_record.
          Hide
          sudheerv Sudheer Vinukonda added a comment -

          Thank you so much, Ilya Grigorik :=)

          Attached is a patch that implements Ilya Grigorik's suggestions for dynamic TLS record sizing.

          Show
          sudheerv Sudheer Vinukonda added a comment - Thank you so much, Ilya Grigorik :=) Attached is a patch that implements Ilya Grigorik 's suggestions for dynamic TLS record sizing.
          Hide
          igrigorik Ilya Grigorik added a comment -

          Sudheer Vinukonda: For smaller records, we should also reserve space for various TCP options (timestamps, SACKs.. up to 40 bytes [1]), and account for TLS record overhead (another 20-60 bytes on average, depending on the negotiated ciphersuite [2]). All in all: 1500 - 40 (IP) - 20 (TCP) - 40 (TCP options) - TLS overhead (60-100) ~= 1300 bytes. If you inspect records emitted by Google servers, you'll see that they carry ~1300 bytes of application data due to the math above.

          Also, worth nothing that we should make sure that packets are flushed at record boundaries.. Otherwise you quickly end up packing multiple (small) records across packets.

          Finally, 1MB + 1s values is what we use on Google servers and with good results... lgtm.

          [1] http://chimera.labs.oreilly.com/books/1230000000545/ch04.html#TLS_RECORD_SIZE
          [2] https://www.evernote.com/shard/s1/sh/a44a8ee5-d788-44c7-a3ad-6e505f99d592/00e0b141eef44288f338332085044e6a (courtesy of Ivan Ristic's excellent Bulletproof SSL and TLS book)

          Show
          igrigorik Ilya Grigorik added a comment - Sudheer Vinukonda : For smaller records, we should also reserve space for various TCP options (timestamps, SACKs.. up to 40 bytes [1] ), and account for TLS record overhead (another 20-60 bytes on average, depending on the negotiated ciphersuite [2] ). All in all: 1500 - 40 (IP) - 20 (TCP) - 40 (TCP options) - TLS overhead (60-100) ~= 1300 bytes. If you inspect records emitted by Google servers, you'll see that they carry ~1300 bytes of application data due to the math above. Also, worth nothing that we should make sure that packets are flushed at record boundaries.. Otherwise you quickly end up packing multiple (small) records across packets. Finally, 1MB + 1s values is what we use on Google servers and with good results... lgtm. [1] http://chimera.labs.oreilly.com/books/1230000000545/ch04.html#TLS_RECORD_SIZE [2] https://www.evernote.com/shard/s1/sh/a44a8ee5-d788-44c7-a3ad-6e505f99d592/00e0b141eef44288f338332085044e6a (courtesy of Ivan Ristic's excellent Bulletproof SSL and TLS book)
          Hide
          sudheerv Sudheer Vinukonda added a comment - - edited

          Discussed with the stalwarts on IRC and here's the consensus (quote from Thomas Jackson):

          "use small TLS records that fit into a single TCP segment for the first ~1 MB of data, increase record size to 16 KB after that to optimize throughput, and then reset record size back to a single segment after ~1 second of inactivity—lather, rinse, repeat."

          The recommendation from Leif Hedstrom, Thomas Jackson, Brian Geffon etc is to use the below values:

          smaller TLS record size: MTU/MSS (1500) minus the TCP (20 bytes) and IP (40 bytes) overheads: 1500 - 40 - 20 = 1440 bytes
          larger TLS record size: maximum TLS record size which is 16383 (2^14 - 1)

          Ilya Grigorik : Hi, it seems that you are on the "watch-list" on this jira. Can you please review and provide any suggestions/feedback you may have on the above proposal/values.

          Show
          sudheerv Sudheer Vinukonda added a comment - - edited Discussed with the stalwarts on IRC and here's the consensus (quote from Thomas Jackson ): "use small TLS records that fit into a single TCP segment for the first ~1 MB of data, increase record size to 16 KB after that to optimize throughput, and then reset record size back to a single segment after ~1 second of inactivity—lather, rinse, repeat." The recommendation from Leif Hedstrom , Thomas Jackson , Brian Geffon etc is to use the below values: smaller TLS record size: MTU/MSS (1500) minus the TCP (20 bytes) and IP (40 bytes) overheads: 1500 - 40 - 20 = 1440 bytes larger TLS record size: maximum TLS record size which is 16383 (2^14 - 1) Ilya Grigorik : Hi, it seems that you are on the "watch-list" on this jira. Can you please review and provide any suggestions/feedback you may have on the above proposal/values.

            People

            • Assignee:
              sudheerv Sudheer Vinukonda
              Reporter:
              jamespeach James Peach
            • Votes:
              2 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development