Cassandra / CASSANDRA-13474

Cassandra pluggable storage engine

Details

    • Type: New Feature
    • Status: Open
    • Priority: Normal
    • Resolution: Unresolved
    • Fix Version/s: None
    • Component/s: Legacy/Core
    • Labels: None

    Description

      Instagram is working on a project, named Rocksandra, to significantly reduce Cassandra's tail latency by implementing a new storage engine on top of RocksDB.

      We started with a prototype for the single-column (key-value) use case, and then implemented a full design that supports most of the data types and data models in Cassandra, as well as streaming.

      After a year of development and testing, we have rolled out Rocksandra to our internal deployments and observed a 3-4x reduction in P99 read latency in general, and a more than 10x reduction for some use cases.

      We published a blog post about the wins, with benchmark metrics from an AWS environment: https://engineering.instagram.com/open-sourcing-a-10x-reduction-in-apache-cassandra-tail-latency-d64f86b43589

      I think the biggest performance win comes from eliminating most of the Java garbage created by the current read/write path and compactions, which reduces JVM overhead and makes latency more predictable.

      We are very excited about the potential performance gain. As the next step, I propose making the Cassandra storage engine pluggable (as in MySQL and MongoDB). We are very interested in working with the community to provide RocksDB as one storage option with more predictable performance.
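
      To illustrate what "pluggable" could mean in practice, here is a minimal sketch. It is not the API from the design doc or from Cassandra itself: the names StorageEngine, InMemoryEngine, and EngineRegistry are hypothetical, and the key-value contract is deliberately reduced to the single-column case mentioned above. A RocksDB-backed engine would be another implementation of the same interface, registered under a different name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class PluggableStorageSketch {

    /** Hypothetical engine contract (an assumption, not Cassandra's actual API). */
    public interface StorageEngine {
        void put(String key, byte[] value);
        Optional<byte[]> get(String key);
        void delete(String key);
    }

    /** Trivial engine that keeps everything on the Java heap; for illustration only. */
    public static final class InMemoryEngine implements StorageEngine {
        private final Map<String, byte[]> data = new ConcurrentHashMap<>();

        @Override
        public void put(String key, byte[] value) {
            data.put(key, value.clone()); // defensive copy of the caller's array
        }

        @Override
        public Optional<byte[]> get(String key) {
            byte[] value = data.get(key);
            return value == null ? Optional.empty() : Optional.of(value.clone());
        }

        @Override
        public void delete(String key) {
            data.remove(key);
        }
    }

    /** Maps a configured engine name to a factory, so callers never name a concrete class. */
    public static final class EngineRegistry {
        private final Map<String, Supplier<StorageEngine>> factories = new ConcurrentHashMap<>();

        public void register(String name, Supplier<StorageEngine> factory) {
            factories.put(name, factory);
        }

        public StorageEngine create(String name) {
            Supplier<StorageEngine> factory = factories.get(name);
            if (factory == null) {
                throw new IllegalArgumentException("Unknown storage engine: " + name);
            }
            return factory.get();
        }
    }

    public static void main(String[] args) {
        EngineRegistry registry = new EngineRegistry();
        registry.register("in-memory", InMemoryEngine::new);

        // The engine name would come from configuration in a real system.
        StorageEngine engine = registry.create("in-memory");
        engine.put("user:1", "alice".getBytes(StandardCharsets.UTF_8));

        String value = engine.get("user:1")
                .map(bytes -> new String(bytes, StandardCharsets.UTF_8))
                .orElse("<missing>");
        System.out.println(value);
    }
}
```

      The point of the registry is that the read/write path depends only on the interface, so an engine can be selected by name (for example per table or per node) without changes to the callers.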

      Design doc for pluggable storage engine: https://docs.google.com/document/d/1suZlvhzgB6NIyBNpM9nxoHxz_Ri7qAm-UEO8v8AIFsc/edit

            People

              Assignee: Unassigned
              Reporter: Dikang Gu (dikanggu)
              Votes: 18
              Watchers: 58

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 10m