Cassandra
  1. Cassandra
  2. CASSANDRA-3333

remove more copies from read/write network path

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Fix Version/s: 1.0.1
    • Component/s: Core
    • Labels:

      Description

      RowMutation.serializedBuffer and ReadVerbHandler both do an extra copy of the serialized data. We can avoid this be pre-computing the serialized size and allocating an appropriate buffer.

        Activity

        Jonathan Ellis created issue -
        Hide
        Jonathan Ellis added a comment -

        01 cleans up the serializer snafu we've been living with
        02 actually does the optimization

        Prior to this patch we had three generic serializers:

        ICompactSerializer, which is used by Messages, and whose de/serialize methods have version parameters

        ICompactSerializer2, which does not have versions and whose methods have DataInput/Output parameters instead of Input/OutputStreams. (The former is a superset of the later.)

        ICompactSerializer3, which extends ICS2 and adds serializedSize.

        To do this optimization I'd need ICS4, since I want serializedSize AND versioning.

        Instead, I've replaced the three old interfaces with just ISerializer and IVersionedSerializer. Both use DataInput/Output, and both have a serializedSize method; the difference is that the former does not have a version parameter. Implementers that don't care about precomputing size simply implement as UnsupportedOperation.

        A few serializers (in LegacyBloomFilter and MerkleTree) still needed a Stream parameter because they are doing JDK serialization under the hood. None of that code actually cares about using a generic Serializer interface, so I just made them one-off classes and everything was happy.

        Show
        Jonathan Ellis added a comment - 01 cleans up the serializer snafu we've been living with 02 actually does the optimization Prior to this patch we had three generic serializers: ICompactSerializer, which is used by Messages, and whose de/serialize methods have version parameters ICompactSerializer2, which does not have versions and whose methods have DataInput/Output parameters instead of Input/OutputStreams. (The former is a superset of the later.) ICompactSerializer3, which extends ICS2 and adds serializedSize. To do this optimization I'd need ICS4, since I want serializedSize AND versioning. Instead, I've replaced the three old interfaces with just ISerializer and IVersionedSerializer. Both use DataInput/Output, and both have a serializedSize method; the difference is that the former does not have a version parameter. Implementers that don't care about precomputing size simply implement as UnsupportedOperation. A few serializers (in LegacyBloomFilter and MerkleTree) still needed a Stream parameter because they are doing JDK serialization under the hood. None of that code actually cares about using a generic Serializer interface, so I just made them one-off classes and everything was happy.
        Jonathan Ellis made changes -
        Field Original Value New Value
        Attachment 0001-clean-up-Serializer-mess.patch [ 12498227 ]
        Attachment 0002-reduce-copies.patch [ 12498228 ]
        Jonathan Ellis made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Labels network
        Reviewer brandon.williams
        Hide
        Brandon Williams added a comment -

        +1

        Show
        Brandon Williams added a comment - +1
        Hide
        Jonathan Ellis added a comment -

        New version of patch 2, covering the rest of the serializations on the read and write path. Extracted serialize-and-assert-it-matched-claimed-size code into FBUtilities.serialize.

        Show
        Jonathan Ellis added a comment - New version of patch 2, covering the rest of the serializations on the read and write path. Extracted serialize-and-assert-it-matched-claimed-size code into FBUtilities.serialize.
        Jonathan Ellis made changes -
        Attachment 0002-introduce-FBUtilities.serialize-to-reduce-copies.patch [ 12498387 ]
        Jonathan Ellis made changes -
        Attachment 0002-introduce-FBUtilities.serialize-to-reduce-copies.patch [ 12498387 ]
        Jonathan Ellis made changes -
        Attachment 0002-introduce-FBUtilities.serialize-to-reduce-copies.patch [ 12498388 ]
        Hide
        Brandon Williams added a comment -

        Doesn't quite work:

         INFO 08:19:01,875 Creating new commitlog segment /var/lib/cassandra/commitlog/CommitLog-1318339141875.log
        ERROR 08:19:01,884 Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
        java.lang.AssertionError
                at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:782)
                at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:278)
                at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122)
                at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:605)
                at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
                at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
                at java.lang.Thread.run(Thread.java:662)
        
        Show
        Brandon Williams added a comment - Doesn't quite work: INFO 08:19:01,875 Creating new commitlog segment /var/lib/cassandra/commitlog/CommitLog-1318339141875.log ERROR 08:19:01,884 Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main] java.lang.AssertionError at org.apache.cassandra.utils.FBUtilities.serialize(FBUtilities.java:782) at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:278) at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:122) at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:605) at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) at java.lang.Thread.run(Thread.java:662)
        Jonathan Ellis made changes -
        Attachment 0002-introduce-FBUtilities.serialize-to-reduce-copies.patch [ 12498388 ]
        Jonathan Ellis made changes -
        Attachment 0002-reduce-copies.patch [ 12498228 ]
        Hide
        Jonathan Ellis added a comment -

        updated 02 w/ fixes

        Show
        Jonathan Ellis added a comment - updated 02 w/ fixes
        Jonathan Ellis made changes -
        Hide
        Brandon Williams added a comment -

        +1, approximate 10% gain on both reads and writes.

        Show
        Brandon Williams added a comment - +1, approximate 10% gain on both reads and writes.
        Hide
        Jonathan Ellis added a comment -

        committed

        Show
        Jonathan Ellis added a comment - committed
        Jonathan Ellis made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Gavin made changes -
        Workflow no-reopen-closed, patch-avail [ 12636876 ] patch-available, re-open possible [ 12748782 ]
        Gavin made changes -
        Workflow patch-available, re-open possible [ 12748782 ] reopen-resolved, no closed status, patch-avail, testing [ 12753895 ]

          People

          • Assignee:
            Jonathan Ellis
            Reporter:
            Jonathan Ellis
            Reviewer:
            Brandon Williams
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development