Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • 7.1.0
    • Network, SSL
    • None

    Description

      UnixNetVConnection does a vectored write which bunches blocks together until the outgoing buffer will fill up. This means we can better fill the packets, and should give us a bit of a performance boost to SSL writes.

      Attachments

        Activity

          jacksontj Thomas Jackson created issue -
          jacksontj Thomas Jackson made changes -
          Field Original Value New Value
          Summary SSLNetVConnection writes each block separately to the socket SSLNetVConnection: switch to a vectored write
          briang Brian Geffon added a comment -

          UnixNetVConnection uses writev but OpenSSL only has SSL_write it doesn't have an SSL_writev.

          briang Brian Geffon added a comment - UnixNetVConnection uses writev but OpenSSL only has SSL_write it doesn't have an SSL_writev.
          briang Brian Geffon added a comment - - edited

          After investigation with Thomas Jackson it appears that SSLNetVConnection::load_buffer_and_write is called with all of the data in multiple IOBufferBlock so it results in multiple calls to SSLWriteBuffer. To experiment we created a small patch that just copies all of the IOBufferBlocks to a single block of memory followed by a single SSLWriteBuffer call and it resolves the problem. So since this is obviously not a viable solution we need to figure out why the SSL_write calls are being split up in such a strange way.

          We had a second theory that was related to the BIO flush calls, so we made a patch that would only call BIO flush once all the SSL_writes are complete and that did not seem to resolve the problem.

          We should be able to call SSL_write multiple times without having it delay the write to socket for such a large amount of time.

          briang Brian Geffon added a comment - - edited After investigation with Thomas Jackson it appears that SSLNetVConnection::load_buffer_and_write is called with all of the data in multiple IOBufferBlock so it results in multiple calls to SSLWriteBuffer . To experiment we created a small patch that just copies all of the IOBufferBlocks to a single block of memory followed by a single SSLWriteBuffer call and it resolves the problem. So since this is obviously not a viable solution we need to figure out why the SSL_write calls are being split up in such a strange way. We had a second theory that was related to the BIO flush calls, so we made a patch that would only call BIO flush once all the SSL_writes are complete and that did not seem to resolve the problem. We should be able to call SSL_write multiple times without having it delay the write to socket for such a large amount of time.
          briang Brian Geffon added a comment -

          Susan Hinrichs do you have any thoughts why two calls to SSL_write that are basically back-to-back would result in such a large delay between data going over the wire?

          briang Brian Geffon added a comment - Susan Hinrichs do you have any thoughts why two calls to SSL_write that are basically back-to-back would result in such a large delay between data going over the wire?
          shinrich Susan Hinrichs added a comment -

          Fiddling with the nagle times as Sudheer Vinukonda suggested might also help.

          May ultimately need to not use SSL_write but do BIO stuff directly to get the control we need.

          From chatting with Thomas Jackson last night, it wasn't clear that the separate writes were really causing the delay. It sounded like the delays only occurred if the box was in rotation (under some load). But I see that you guys have done some more experiments since then. If you could share a pcap file of a pathological case, that would be great.

          Also Dave Thompson might have some insights.

          shinrich Susan Hinrichs added a comment - Fiddling with the nagle times as Sudheer Vinukonda suggested might also help. May ultimately need to not use SSL_write but do BIO stuff directly to get the control we need. From chatting with Thomas Jackson last night, it wasn't clear that the separate writes were really causing the delay. It sounded like the delays only occurred if the box was in rotation (under some load). But I see that you guys have done some more experiments since then. If you could share a pcap file of a pathological case, that would be great. Also Dave Thompson might have some insights.
          shinrich Susan Hinrichs added a comment -

          I think I see why the system isn't coalescing SSL_write blocks. We call SSL_write from SSLWriteBuffer() in SSLUtils.c After the SSL_write, we call BIO_flush, which I assume will force out a packet.

          Instead, we should move the BIO_flush call to the end of SSLNetVConnection::load_buffer_and_write() after all the queued up blocks have been written via SSLWriteBuffer.

          I'll experiment on my own with this, but since Thomas Jackson can exercise this problem at will, thought you might want to experiment in parallel.

          shinrich Susan Hinrichs added a comment - I think I see why the system isn't coalescing SSL_write blocks. We call SSL_write from SSLWriteBuffer() in SSLUtils.c After the SSL_write, we call BIO_flush, which I assume will force out a packet. Instead, we should move the BIO_flush call to the end of SSLNetVConnection::load_buffer_and_write() after all the queued up blocks have been written via SSLWriteBuffer. I'll experiment on my own with this, but since Thomas Jackson can exercise this problem at will, thought you might want to experiment in parallel.
          briang Brian Geffon added a comment -

          Susan Hinrichs That's actually the first thing I tried. I removed the BIO_flush and only called it after all SSL_writes were completed and we still observed the problem.

          briang Brian Geffon added a comment - Susan Hinrichs That's actually the first thing I tried. I removed the BIO_flush and only called it after all SSL_writes were completed and we still observed the problem.
          briang Brian Geffon added a comment -

          After further investigation and reading: https://www.openssl.org/docs/crypto/BIO_s_connect.html#EXAMPLE , it appears we shouldn't be mixing BIOs and SSL_write / SSL_reads, I'm thinking about putting a patch together that uses BIOs only. Susan Hinrichs , any thoughts on this?

          briang Brian Geffon added a comment - After further investigation and reading: https://www.openssl.org/docs/crypto/BIO_s_connect.html#EXAMPLE , it appears we shouldn't be mixing BIOs and SSL_write / SSL_reads, I'm thinking about putting a patch together that uses BIOs only. Susan Hinrichs , any thoughts on this?
          briang Brian Geffon made changes -
          Assignee Brian Geffon [ briang ]
          zwoop Leif Hedstrom made changes -
          Component/s Network [ 12313122 ]
          Component/s SSL [ 12313766 ]
          zwoop Leif Hedstrom made changes -
          Fix Version/s 6.0.0 [ 12324897 ]
          shinrich Susan Hinrichs added a comment -

          Agreed that switching to BIO's entirely would make sense. It gives us control of our own buffering if nothing else.

          We currently use mem BIO's during the server side of the handshake, so we can decide after the CLIENT HELLO to forward on for a blind tunnel. But after the handshake is done we flip back to a socket BIO. We still use the standard SSL_accept even when using the mem bio.

          shinrich Susan Hinrichs added a comment - Agreed that switching to BIO's entirely would make sense. It gives us control of our own buffering if nothing else. We currently use mem BIO's during the server side of the handshake, so we can decide after the CLIENT HELLO to forward on for a blind tunnel. But after the handshake is done we flip back to a socket BIO. We still use the standard SSL_accept even when using the mem bio.
          briang Brian Geffon added a comment -

          Susan Hinrichs, as I'm fairly new to these code paths, would you mind just pointing me to a few key locations so I can get started with the patch? When you say we use mem BIOs during the handhsake and then switch to SSL_accept, that means we're no longer using the BIOs? I don;t fully understand what you mean by that...

          briang Brian Geffon added a comment - Susan Hinrichs , as I'm fairly new to these code paths, would you mind just pointing me to a few key locations so I can get started with the patch? When you say we use mem BIOs during the handhsake and then switch to SSL_accept, that means we're no longer using the BIOs? I don;t fully understand what you mean by that...
          shinrich Susan Hinrichs added a comment -

          Well we are always using BIOs. They are the underpinnings of the SSL_write/SSL_read calls too. Mostly we using socket BIOs. For example in the make_ssl_connection function, we set up a file descriptor BIO associated with the socket and a memory buffer BIO. Then those BIOs are bound to the ssl object. We then just use the standard SSL_read/SSL_write/SSL_accept calls.

          I think you do need to call the SSL_* functions to do the appropriate encryption. Look at the BIO_s_connect example you reference a couple comments up, I think they are just doing a HTTP exchange.

          By replacing the socket BIO with a buffer BIO, you should be able to separate the encryption/decryption from the communication. That is what we do on reading with the mem BIO. I think you can do something similar with the write side, but I haven't dug into that. Check out https://www.openssl.org/docs/crypto/BIO_s_mem.html

          shinrich Susan Hinrichs added a comment - Well we are always using BIOs. They are the underpinnings of the SSL_write/SSL_read calls too. Mostly we using socket BIOs. For example in the make_ssl_connection function, we set up a file descriptor BIO associated with the socket and a memory buffer BIO. Then those BIOs are bound to the ssl object. We then just use the standard SSL_read/SSL_write/SSL_accept calls. I think you do need to call the SSL_* functions to do the appropriate encryption. Look at the BIO_s_connect example you reference a couple comments up, I think they are just doing a HTTP exchange. By replacing the socket BIO with a buffer BIO, you should be able to separate the encryption/decryption from the communication. That is what we do on reading with the mem BIO. I think you can do something similar with the write side, but I haven't dug into that. Check out https://www.openssl.org/docs/crypto/BIO_s_mem.html
          zwoop Leif Hedstrom made changes -
          Fix Version/s 6.1.0 [ 12324898 ]
          Fix Version/s 6.0.0 [ 12324897 ]
          zwoop Leif Hedstrom made changes -
          Assignee Brian Geffon [ briang ] Susan Hinrichs [ shinrich ]
          zwoop Leif Hedstrom made changes -
          Fix Version/s 7.0.0 [ 12325951 ]
          Fix Version/s 6.1.0 [ 12324898 ]
          bcall Bryan Call made changes -
          Fix Version/s sometime [ 12316277 ]
          Fix Version/s 7.0.0 [ 12325951 ]
          bcall Bryan Call made changes -
          Fix Version/s 7.1.0 [ 12337879 ]
          Fix Version/s sometime [ 12316277 ]

          People

            shinrich Susan Hinrichs
            jacksontj Thomas Jackson

            Dates

              Created:
              Updated:

              Slack

                Issue deployment