Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-3378

Allow configuration of storage protocol socket buffer

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 1.2.2
    • Component/s: None
    • Labels:

      Description

      Similar to rpc_[send,recv]_buff_size_in_bytes, we should expose this for high latency connections.

      1. cassandra-3378-v2.patch
        5 kB
        Michał Michalski
      2. cassandra-3378-v1.patch
        4 kB
        Michał Michalski
      3. 3378-v4.patch
        4 kB
        Michał Michalski
      4. 3378-v3.patch
        5 kB
        Jason Brown

        Activity

        Hide
        brandon.williams Brandon Williams added a comment -

        Committed v4. Thanks!

        Show
        brandon.williams Brandon Williams added a comment - Committed v4. Thanks!
        Hide
        michalm Michał Michalski added a comment - - edited

        I agree, for me internode_(send|recv)_buff_size_in_bytes is much better / descriptive name too. Attaching updated patch.

        Show
        michalm Michał Michalski added a comment - - edited I agree, for me internode_(send|recv)_buff_size_in_bytes is much better / descriptive name too. Attaching updated patch.
        Hide
        jasobrown Jason Brown added a comment -

        Marcus, yeah, that's a useful modification . +1 on renaming the yaml property name (and the downstream changes changes in Config/DD).

        Show
        jasobrown Jason Brown added a comment - Marcus, yeah, that's a useful modification . +1 on renaming the yaml property name (and the downstream changes changes in Config/DD).
        Hide
        krummas Marcus Eriksson added a comment -

        could we perhaps call the config parameter "internode_(send|recv)_buff_size_in_bytes" instead? makes it a bit clearer (atleast to me) what we are actually configuring

        other than that, lgtm

        Show
        krummas Marcus Eriksson added a comment - could we perhaps call the config parameter "internode_(send|recv)_buff_size_in_bytes" instead? makes it a bit clearer (atleast to me) what we are actually configuring other than that, lgtm
        Hide
        jasobrown Jason Brown added a comment -

        Minor cleanup of Michal's patch. Rebased against current 1.2 branch. Restored the Buffered*outStream size to 4096 and added yaml comment.

        Show
        jasobrown Jason Brown added a comment - Minor cleanup of Michal's patch. Rebased against current 1.2 branch. Restored the Buffered*outStream size to 4096 and added yaml comment.
        Hide
        jasobrown Jason Brown added a comment -

        Looks like snappy does it's own internal buffering (for compressing streams). SnappyOutputStream has an 'uncompressed' buffer of 32k, which flushes to it's output stream when full or flush() is called. SnappyInputStream creates a buffer based upon the size of the chunk size of the current block. Thus, we shouldn't need to worry about having Buffered*putStream for compressed streams. We could play with the buffer size there (4k vs. 8k), but I think the biggest win is just having any buffer before the socket output stream (which doesn't buffer at all). I'm in favor of leaving the default 4096 buffer size for Buffered*putStream on uncompressed streams until there is sufficient evidence/testing to switch.

        As for the socket's buffer config, we can use Michal's new yaml settings for the in/out sockets. However, I think storage_send_buff_size_in_bytes should be commented out by default in the yaml.

        Show
        jasobrown Jason Brown added a comment - Looks like snappy does it's own internal buffering (for compressing streams). SnappyOutputStream has an 'uncompressed' buffer of 32k, which flushes to it's output stream when full or flush() is called. SnappyInputStream creates a buffer based upon the size of the chunk size of the current block. Thus, we shouldn't need to worry about having Buffered*putStream for compressed streams. We could play with the buffer size there (4k vs. 8k), but I think the biggest win is just having any buffer before the socket output stream (which doesn't buffer at all). I'm in favor of leaving the default 4096 buffer size for Buffered*putStream on uncompressed streams until there is sufficient evidence/testing to switch. As for the socket's buffer config, we can use Michal's new yaml settings for the in/out sockets. However, I think storage_send_buff_size_in_bytes should be commented out by default in the yaml.
        Hide
        michalm Michał Michalski added a comment -

        Thanks for reviewing, Brandon.

        Changes in v2:
        1) Remove receive buffer setting from OutboundTcpConnection - as it's being used for Messaging purposes only and not really needed there
        2) Set receive buffer (only) for IncommingTcpConnection. My concern is the way of changing this parameter in a kind of "wide" way - not only for streaming (for which it makes sense for me), but much earlier, so it "affects" messaging parts of code too. I'm unsure of the impact it may have on the messaging service in general - I don't have that much experience with Cassandra & network programming to judge it "just like that".

        Show
        michalm Michał Michalski added a comment - Thanks for reviewing, Brandon. Changes in v2: 1) Remove receive buffer setting from OutboundTcpConnection - as it's being used for Messaging purposes only and not really needed there 2) Set receive buffer (only) for IncommingTcpConnection. My concern is the way of changing this parameter in a kind of "wide" way - not only for streaming (for which it makes sense for me), but much earlier, so it "affects" messaging parts of code too. I'm unsure of the impact it may have on the messaging service in general - I don't have that much experience with Cassandra & network programming to judge it "just like that".
        Hide
        brandon.williams Brandon Williams added a comment -

        You actually need to set the receive buffer size in IncomingTcpConnection. It sounds like the lack of a buffer for compression is either an oversight or is buffered somewhere deeper in the stack. I'll run some tests when we're ready.

        Show
        brandon.williams Brandon Williams added a comment - You actually need to set the receive buffer size in IncomingTcpConnection. It sounds like the lack of a buffer for compression is either an oversight or is buffered somewhere deeper in the stack. I'll run some tests when we're ready.
        Hide
        michalm Michał Michalski added a comment -

        I've attached an initial patch. As I mentioned I had a little concern on what to do with the default buffer size that is set for BufferedOutputStream and why is it set only for uncompressed OutputStream. I've read CASSANDRA-1896 and I understand why it's set to 4096, but I can only guess why it's not set for compressed stream too. Anyway - I'm pretty sure that user settings should overwrite this default 4096, but I wonder if default of 4096 should stay in code (so default buffer size values for compressed and uncompressed streams are handled separately) or maybe it should be set by default in cassandra.yaml. However, second solution (which I picked for now) causes compressed output stream to have default buffer size of 4096 bytes too, so it leads back to my first concern: is this a behaviour we expect?

        Waiting for a comment and I'll improve this patch according to your suggestion.

        Show
        michalm Michał Michalski added a comment - I've attached an initial patch. As I mentioned I had a little concern on what to do with the default buffer size that is set for BufferedOutputStream and why is it set only for uncompressed OutputStream. I've read CASSANDRA-1896 and I understand why it's set to 4096, but I can only guess why it's not set for compressed stream too. Anyway - I'm pretty sure that user settings should overwrite this default 4096, but I wonder if default of 4096 should stay in code (so default buffer size values for compressed and uncompressed streams are handled separately) or maybe it should be set by default in cassandra.yaml. However, second solution (which I picked for now) causes compressed output stream to have default buffer size of 4096 bytes too, so it leads back to my first concern: is this a behaviour we expect? Waiting for a comment and I'll improve this patch according to your suggestion.
        Hide
        michalm Michał Michalski added a comment -

        Just pondering... In the existing code there's a BufferedOutputStream with buffer size of 4096 bytes at first, which can be then "overwritten" by a stream with default buffer size (no specified size is passed to constructor) if compression is enabled.
        My concern is: if there's a reason for setting different buffer size for compressed and non-compressed connection by default, is this valid to apply the same value (provided by user in config) for both connection types?
        Shouldn't they be distinguished on a config level too? For example by "splitting" storage_send_buff_size_bytes (that's how I temporarily named this new param) into two options: storage_uncompressed_send_buff_size_bytes and storage_compressed_send_buff_size_bytes?
        Or maybe we assume that for this special environment (high latency connection) the performance gain achieved by using larger buffer is big enough to just ignore the case I mentioned and use the same value in these two places in code?

        Show
        michalm Michał Michalski added a comment - Just pondering... In the existing code there's a BufferedOutputStream with buffer size of 4096 bytes at first, which can be then "overwritten" by a stream with default buffer size (no specified size is passed to constructor) if compression is enabled. My concern is: if there's a reason for setting different buffer size for compressed and non-compressed connection by default, is this valid to apply the same value (provided by user in config) for both connection types? Shouldn't they be distinguished on a config level too? For example by "splitting" storage_send_buff_size_bytes (that's how I temporarily named this new param) into two options: storage_uncompressed_send_buff_size_bytes and storage_compressed_send_buff_size_bytes? Or maybe we assume that for this special environment (high latency connection) the performance gain achieved by using larger buffer is big enough to just ignore the case I mentioned and use the same value in these two places in code?
        Hide
        michalm Michał Michalski added a comment -

        Thanks
        I'll try to find some time soon and submit a patch (hopefully) by the end of the week.

        Show
        michalm Michał Michalski added a comment - Thanks I'll try to find some time soon and submit a patch (hopefully) by the end of the week.
        Hide
        brandon.williams Brandon Williams added a comment -

        "storage protocol" is the internode communication protocol Cassandra uses, which is completely custom (ie, not thrift-based or anything like that.) OutboundTcpConnection is the class you need to modify to set the socket buffer.

        Show
        brandon.williams Brandon Williams added a comment - "storage protocol" is the internode communication protocol Cassandra uses, which is completely custom (ie, not thrift-based or anything like that.) OutboundTcpConnection is the class you need to modify to set the socket buffer.
        Hide
        michalm Michał Michalski added a comment -

        I started to investigate this task (lfh label sounds like a good point to get deeper into Cassandra architecture ), but as I'm new to Cassandra internal architecture, I'm not really sure if I understand the subject of this task. To be more precise - I have a problem with understanding what does "storage protocol" really refers to in this case?

        At first I thought it refers to Thrift (and ThriftServer started by CassandraDaemon), but when I was investigating this issue (by checking how "looking at rpc_[send,recv]_buff_size_in_bytes" setting is set and where) I discovered that ThriftServer already uses the setting mentioned in this task.

        My second idea was "native server" (org.apache.cassandra.transport.Server) which I found in CassandraDaemon too. It uses ServerBootstrap created using NioServerSocketChannelFactory and it seems possible to add such setting there, but I wonder if this is really the service this issue tells about? Or maybe I'm completely wrong? Could you give me some clue?

        Sorry if it's a silly question, but I'm confused

        BTW: Is there maybe a doc that tells a bit more on how Cassandra works in terms of "internal" architecture/dependencies/protocols etc.?

        Show
        michalm Michał Michalski added a comment - I started to investigate this task (lfh label sounds like a good point to get deeper into Cassandra architecture ), but as I'm new to Cassandra internal architecture, I'm not really sure if I understand the subject of this task. To be more precise - I have a problem with understanding what does "storage protocol" really refers to in this case? At first I thought it refers to Thrift (and ThriftServer started by CassandraDaemon), but when I was investigating this issue (by checking how "looking at rpc_ [send,recv] _buff_size_in_bytes" setting is set and where) I discovered that ThriftServer already uses the setting mentioned in this task. My second idea was "native server" (org.apache.cassandra.transport.Server) which I found in CassandraDaemon too. It uses ServerBootstrap created using NioServerSocketChannelFactory and it seems possible to add such setting there, but I wonder if this is really the service this issue tells about? Or maybe I'm completely wrong? Could you give me some clue? Sorry if it's a silly question, but I'm confused BTW: Is there maybe a doc that tells a bit more on how Cassandra works in terms of "internal" architecture/dependencies/protocols etc.?
        Hide
        lenn0x Chris Goffinet added a comment - - edited

        Assigning to Harish, starting him out with something small as his first patch.

        Show
        lenn0x Chris Goffinet added a comment - - edited Assigning to Harish, starting him out with something small as his first patch.

          People

          • Assignee:
            michalm Michał Michalski
            Reporter:
            brandon.williams Brandon Williams
            Reviewer:
            Jason Brown
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development