Type: New Feature
Resolution: Won't Fix
Fix Version/s: None
To improve Cassandra performance, I implemented Cassandra's RPC layer with MessagePack. The implementation is attached as a patch, which works on Cassandra 0.7.0-beta3. Please check it.
MessagePack is a cross-language object serialization library, like Thrift and Protocol Buffers, but it is much faster, smaller, and easier to implement. MessagePack reduces serialization cost and data size, both on the network and on disk.
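To make the data size point concrete, here is a minimal sketch that hand-encodes a small map with the MessagePack single-byte headers (fixmap, fix raw/fixstr, positive fixint) and compares the result with the equivalent JSON text. It is not taken from the attached patch and does not use the MessagePack library; the class name MsgPackSizeSketch, its methods, and the sample map are assumptions made only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: not part of the patch or of the MessagePack library.
class MsgPackSizeSketch {

    // Encodes a map of short strings to small non-negative ints using
    // three MessagePack formats that each need a single header byte:
    //   fixmap:          0x80 | entry count   (up to 15 entries)
    //   fix raw/fixstr:  0xa0 | byte length   (up to 31 bytes)
    //   positive fixint: the value itself     (0..127)
    static byte[] encode(Map<String, Integer> map) {
        if (map.size() > 15) {
            throw new IllegalArgumentException("sketch handles at most 15 entries");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(0x80 | map.size());
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            byte[] key = e.getKey().getBytes(StandardCharsets.UTF_8);
            if (key.length > 31) {
                throw new IllegalArgumentException("sketch handles keys up to 31 bytes");
            }
            out.write(0xa0 | key.length);
            out.write(key, 0, key.length);
            int value = e.getValue();
            if (value < 0 || value > 127) {
                throw new IllegalArgumentException("sketch handles values in 0..127");
            }
            out.write(value);
        }
        return out.toByteArray();
    }

    // Builds the equivalent compact JSON text (no escaping; keys are assumed plain ASCII).
    static String toJson(Map<String, Integer> map) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append('"').append(e.getKey()).append("\":").append(e.getValue());
            first = false;
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> sample = new LinkedHashMap<>();
        sample.put("compact", 1);
        sample.put("schema", 0);
        int packed = encode(sample).length;
        int json = toJson(sample).getBytes(StandardCharsets.UTF_8).length;
        // prints "MessagePack: 18 bytes, JSON: 24 bytes"
        System.out.println("MessagePack: " + packed + " bytes, JSON: " + json + " bytes");
    }
}
```

The saving in this sketch comes from replacing quotes, colons, commas, and braces with one-byte type headers. Real payloads also use the longer MessagePack formats, which this sketch does not cover.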
MessagePack resources:
- website: http://msgpack.org/
This website compares MessagePack, Thrift and JSON.
- design details: http://redmine.msgpack.org/projects/msgpack/wiki/FormatDesign
- source code: https://github.com/msgpack/msgpack/
The performance of the data serialization library is one of the most important issues when developing a distributed database in Java. If serialization is slow, it significantly reduces overall database performance, and Java's GC also runs more often. Cassandra has this problem as well.
To reduce the size of the data sent over the network between a client and Cassandra, I prototyped Cassandra's RPC layer with MessagePack and MessagePack-RPC. The implementation is very simple: it reuses the existing Thrift-based CassandraServer (org.apache.cassandra.thrift.CassandraServer) while adopting MessagePack's communication protocol and data serialization.
Major features of MessagePack-RPC are:
- Asynchronous RPC
- Parallel Pipelining
- Connection pooling
- Delayed return
- Event-driven I/O
- more details: http://redmine.msgpack.org/projects/msgpack/wiki/RPCDesign
- source code: https://github.com/msgpack/msgpack-rpc/
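As a rough illustration of how asynchronous RPC and pipelining fit together, the sketch below tags each request with a message id, following the MessagePack-RPC message layout (request: [0, msgid, method, params]; response: [1, msgid, error, result]), and matches responses to pending futures even when they arrive out of order. It is not the code from the attached patch and does not use the msgpack-rpc library. The class PipeliningSketch, the in-memory outbox that stands in for a socket, and the method name "get" with its single argument are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only: not part of the patch or of the msgpack-rpc library.
class PipeliningSketch {
    static final int REQUEST = 0;   // message type of [0, msgid, method, params]
    static final int RESPONSE = 1;  // message type of [1, msgid, error, result]

    private final AtomicInteger nextId = new AtomicInteger();
    private final Map<Integer, CompletableFuture<Object>> pending = new ConcurrentHashMap<>();

    // Stands in for the network connection; a real client would serialize and send these.
    final List<Object[]> outbox = new ArrayList<>();

    // Queues a request and returns immediately; the caller does not wait for the reply,
    // so several requests can be in flight on one connection.
    CompletableFuture<Object> call(String method, Object... params) {
        int msgid = nextId.getAndIncrement();
        CompletableFuture<Object> future = new CompletableFuture<>();
        pending.put(msgid, future);
        outbox.add(new Object[] {REQUEST, msgid, method, params});
        return future;
    }

    // Called when a response arrives; the msgid selects which caller to wake up.
    void onResponse(Object[] message) {
        int msgid = (Integer) message[1];
        CompletableFuture<Object> future = pending.remove(msgid);
        if (future == null) {
            return; // unknown or already completed msgid: ignore in this sketch
        }
        if (message[2] != null) {
            future.completeExceptionally(new RuntimeException(String.valueOf(message[2])));
        } else {
            future.complete(message[3]);
        }
    }

    public static void main(String[] args) {
        PipeliningSketch client = new PipeliningSketch();
        // Two requests are issued back to back without waiting for a reply.
        CompletableFuture<Object> first = client.call("get", "key1");
        CompletableFuture<Object> second = client.call("get", "key2");
        // Responses arrive in reverse order and are still routed correctly.
        client.onResponse(new Object[] {RESPONSE, 1, null, "value2"});
        client.onResponse(new Object[] {RESPONSE, 0, null, "value1"});
        // prints "value1 value2"
        System.out.println(first.join() + " " + second.join());
    }
}
```

Because the message id alone identifies the caller, the server is free to answer requests in whatever order they finish.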
The attached patch also includes a ring cache program for MessagePack and a test program for it. You can use them to check the behavior of the Cassandra RPC with MessagePack.
Thanks in advance,