Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.8.0
    • Component/s: kv
    • Labels:
      None

      Description

      The samza-kv implementation is currently using LevelDB. RocksDB seems to have a number of interesting options that would be useful to Samza:

      1. Ability to swap out skip list for a faster implementation when restoring data from a changelog stream.
      2. Ability to disable compaction during changelog stream restoration.
      3. Ability to have a single level (or single SST) rather than multiple levels in order to reduce read amplification.
      4. A merge operation for doing lazy read-modify-write during compaction.

      And a lot more.

      1. SAMZA-236.diff
        20 kB
        Naveen Somasundaram
      2. SAMZA-236-1.patch
        10 kB
        Chris Riccomini

        Issue Links

          Activity

          Hide
          criccomini Chris Riccomini added a comment -

          Make sure to delete the hacky maybeCompact stuff in the LevelDB kv store when we switch to RocksDB.

          Show
          criccomini Chris Riccomini added a comment - Make sure to delete the hacky maybeCompact stuff in the LevelDB kv store when we switch to RocksDB.
          Hide
          jghoman Jakob Homan added a comment -

          Also, when we do this, we should move the new class to its own package (org.apache.samza.storage.kv.rocksdb). Currently the leveldb implementation-specific files make up one-third of all the files in the kv package:

          • CachedStore
          • CachedStoreMetrics
          • LevelDbKeyValueStore
          • LevelDbKeyValueStoreMetrics

          This will become messier as more stores are added; each should be in its own package.

          Show
          jghoman Jakob Homan added a comment - Also, when we do this, we should move the new class to its own package (org.apache.samza.storage.kv.rocksdb). Currently the leveldb implementation-specific files make up one-third of all the files in the kv package: CachedStore CachedStoreMetrics LevelDbKeyValueStore LevelDbKeyValueStoreMetrics This will become messier as more stores are added; each should be in its own package.
          Hide
          criccomini Chris Riccomini added a comment -

          Taking this one step farther, I'd like to think about separating samza-kv-leveldb/samza-kv-rocksdb into their own subprojects. This will separate dependencies between the two projects. Not entirely sold on the idea, but I'm not wild about putting LevelDB and RocksDB implementations into the subproject either. It's worth thinking through.

          Show
          criccomini Chris Riccomini added a comment - Taking this one step farther, I'd like to think about separating samza-kv-leveldb/samza-kv-rocksdb into their own subprojects. This will separate dependencies between the two projects. Not entirely sold on the idea, but I'm not wild about putting LevelDB and RocksDB implementations into the subproject either. It's worth thinking through.
          Hide
          jghoman Jakob Homan added a comment -

          +1. Anytime we're introducing a tight coupling focused on a single external project, it will be good to keep it as fenced off from the other components. This will be true not just for the stores, but also out-of-the-box supported producers and consumers (HDFS, S3, etc.).

          Show
          jghoman Jakob Homan added a comment - +1. Anytime we're introducing a tight coupling focused on a single external project, it will be good to keep it as fenced off from the other components. This will be true not just for the stores, but also out-of-the-box supported producers and consumers (HDFS, S3, etc.).
          Hide
          criccomini Chris Riccomini added a comment -

          This will be true not just for the stores, but also out-of-the-box supported producers and consumers (HDFS, S3, etc.).

          110% agree.

          Other random thought: perhaps we just want to ditch LevelDB entirely? There is some risk with doing so, since RocksDB will surely have its own quirks, but I feel like it might make sense to just have one since they're basically the same thing. It's hard to justify keeping LevelDB around when RocksDB is running smoothly.

          Show
          criccomini Chris Riccomini added a comment - This will be true not just for the stores, but also out-of-the-box supported producers and consumers (HDFS, S3, etc.). 110% agree. Other random thought: perhaps we just want to ditch LevelDB entirely? There is some risk with doing so, since RocksDB will surely have its own quirks, but I feel like it might make sense to just have one since they're basically the same thing. It's hard to justify keeping LevelDB around when RocksDB is running smoothly.
          Hide
          sriramsub Sriram Subramanian added a comment -

          1. samza-kv-leveldb and samza-kv-rocksdb makes sense to me. The implementation of the kv store is mixed with one specific implementation (leveldb) now. With more stores, having them separate is useful.
          2. We should implement RocksDB kv and see how it performs. If it solves all our problems, we can remove support for leveldb.

          Show
          sriramsub Sriram Subramanian added a comment - 1. samza-kv-leveldb and samza-kv-rocksdb makes sense to me. The implementation of the kv store is mixed with one specific implementation (leveldb) now. With more stores, having them separate is useful. 2. We should implement RocksDB kv and see how it performs. If it solves all our problems, we can remove support for leveldb.
          Hide
          criccomini Chris Riccomini added a comment -

          2. We should implement RocksDB kv and see how it performs. If it solves all our problems, we can remove support for leveldb.

          I'm leaning this way as well.

          Show
          criccomini Chris Riccomini added a comment - 2. We should implement RocksDB kv and see how it performs. If it solves all our problems, we can remove support for leveldb. I'm leaning this way as well.
          Hide
          criccomini Chris Riccomini added a comment -

          I've opened up a Github issue with RocksDB to track their Maven publication.

          Show
          criccomini Chris Riccomini added a comment - I've opened up a Github issue with RocksDB to track their Maven publication.
          Hide
          criccomini Chris Riccomini added a comment -

          I've opened up a JIRA with Sonatype.

          Show
          criccomini Chris Riccomini added a comment - I've opened up a JIRA with Sonatype.
          Hide
          criccomini Chris Riccomini added a comment -

          Naveen, could you please post the diff here? I noticed you just did the RB. Posting here is what grants Apache rights.

          Show
          criccomini Chris Riccomini added a comment - Naveen , could you please post the diff here? I noticed you just did the RB. Posting here is what grants Apache rights.
          Hide
          criccomini Chris Riccomini added a comment - - edited

          RB at: https://reviews.apache.org/r/26059/

          Note, this RB is dependent on the Sonatype RocksDB release that I'm working on here. The current patch appears to add a "libs" directory where you can stick the locally-built rocksdb jar, once I've posted the RocksDB patch for it.

          In short, this patch probably wont build for anyone right now.

          Show
          criccomini Chris Riccomini added a comment - - edited RB at: https://reviews.apache.org/r/26059/ Note, this RB is dependent on the Sonatype RocksDB release that I'm working on here . The current patch appears to add a "libs" directory where you can stick the locally-built rocksdb jar, once I've posted the RocksDB patch for it. In short, this patch probably wont build for anyone right now.
          Hide
          criccomini Chris Riccomini added a comment -

          Naveen, I've applied the patch locally, alongside my patch. It doesn't seem to compile.

          /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala:19: error: value setCacheSize is not a member of org.rocksdb.Options
              options.setCacheSize(cacheSize / containerContext.taskNames.size)
                      ^
          /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala:21: error: value setBlockSize is not a member of org.rocksdb.Options
              options.setBlockSize(storeConfig.getInt("rocksdb.block.size.bytes", 4096))
                      ^
          /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala:30: error: value setFilter is not a member of org.rocksdb.Options
              options.setFilter(new BloomFilter);
                      ^
          /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala:32: error: value setCacheSize is not a member of org.rocksdb.Options
              options.setCacheSize(128*1024*1024)
                      ^
          /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala:49: error: not found: type Logging
            val metrics: KeyValueStoreMetrics = new KeyValueStoreMetrics) extends KeyValueStore[Array[Byte], Array[Byte]] with Logging {
                                                                                                                               ^
          /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala:126: error: not found: value trace
              trace("Flush in RocksDbKeyValueStore is not supported, ignoring")
              ^
          /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala:130: error: not found: value trace
              trace("Closing.")
              ^
          /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/TestRocksDbApi.scala:15: error: value setDisableSeekCompaction is not a member of org.rocksdb.Options
              options.setCreateIfMissing(true).createStatistics().setWriteBufferSize(8 * SizeUnit.KB).setMaxWriteBufferNumber(3).setDisableSeekCompaction(
                                                                                                                                 ^
          /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/TestRocksDbApi.scala:22: error: value disableSeekCompaction is not a member of org.rocksdb.Options
              assert(options.disableSeekCompaction() == true);
                             ^
          /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/TestRocksDbApi.scala:23: error: value blockSize is not a member of org.rocksdb.Options
              assert(options.blockSize() == 64 * SizeUnit.KB);
                             ^
          10 errors found
          :samza-kv-rocksdb_2.10:compileScala FAILED
          
          FAILURE: Build failed with an exception.
          

          Sure enough, these methods don't seem to be in RocksDB's master branch. Do I need to apply some patch to master to get this to work? Thought your patches were merged in?

          Show
          criccomini Chris Riccomini added a comment - Naveen , I've applied the patch locally, alongside my patch . It doesn't seem to compile. /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala:19: error: value setCacheSize is not a member of org.rocksdb.Options options.setCacheSize(cacheSize / containerContext.taskNames.size) ^ /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala:21: error: value setBlockSize is not a member of org.rocksdb.Options options.setBlockSize(storeConfig.getInt("rocksdb.block.size.bytes", 4096)) ^ /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala:30: error: value setFilter is not a member of org.rocksdb.Options options.setFilter(new BloomFilter); ^ /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala:32: error: value setCacheSize is not a member of org.rocksdb.Options options.setCacheSize(128*1024*1024) ^ /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala:49: error: not found: type Logging val metrics: KeyValueStoreMetrics = new KeyValueStoreMetrics) extends KeyValueStore[Array[Byte], Array[Byte]] with Logging { ^ /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala:126: error: not found: value trace trace("Flush in RocksDbKeyValueStore is not supported, ignoring") ^ /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala:130: error: not found: value trace trace("Closing.") ^ /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/TestRocksDbApi.scala:15: error: value setDisableSeekCompaction is not a member of org.rocksdb.Options options.setCreateIfMissing(true).createStatistics().setWriteBufferSize(8 * SizeUnit.KB).setMaxWriteBufferNumber(3).setDisableSeekCompaction( ^ /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/TestRocksDbApi.scala:22: error: value disableSeekCompaction is not a member of org.rocksdb.Options assert(options.disableSeekCompaction() == true); ^ /Users/criccomi/Code/incubator-samza/samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/TestRocksDbApi.scala:23: error: value blockSize is not a member of org.rocksdb.Options assert(options.blockSize() == 64 * SizeUnit.KB); ^ 10 errors found :samza-kv-rocksdb_2.10:compileScala FAILED FAILURE: Build failed with an exception. Sure enough, these methods don't seem to be in RocksDB's master branch. Do I need to apply some patch to master to get this to work? Thought your patches were merged in?
          Hide
          naveenatceg Naveen Somasundaram added a comment -

          Hey Chris Riccomini, I will add a patch to the ticket as well. As per the build, that's weird, I don't see it the repository as well, but a quick google search shows me that those options did exist.
          https://github.com/vpallabs/vpal/blob/master/src/rocksdb/java/RocksDBSample.java

              Filter filter = new BloomFilter(10);
              options.setCreateIfMissing(true)
                  .createStatistics()
                  .setWriteBufferSize(8 * SizeUnit.KB)
                  .setMaxWriteBufferNumber(3)
                  .setDisableSeekCompaction(true)
                  .setBlockSize(64 * SizeUnit.KB)
                  .setMaxBackgroundCompactions(10)
                  .setFilter(filter);
              Statistics stats = options.statisticsPtr();
          
              assert(options.createIfMissing() == true);
              assert(options.writeBufferSize() == 8 * SizeUnit.KB);
              assert(options.maxWriteBufferNumber() == 3);
              assert(options.disableSeekCompaction() == true);
              assert(options.blockSize() == 64 * SizeUnit.KB);
              assert(options.maxBackgroundCompactions() == 10);
          

          Let me check if they have changed anything in the master.

          Show
          naveenatceg Naveen Somasundaram added a comment - Hey Chris Riccomini , I will add a patch to the ticket as well. As per the build, that's weird, I don't see it the repository as well, but a quick google search shows me that those options did exist. https://github.com/vpallabs/vpal/blob/master/src/rocksdb/java/RocksDBSample.java Filter filter = new BloomFilter(10); options.setCreateIfMissing( true ) .createStatistics() .setWriteBufferSize(8 * SizeUnit.KB) .setMaxWriteBufferNumber(3) .setDisableSeekCompaction( true ) .setBlockSize(64 * SizeUnit.KB) .setMaxBackgroundCompactions(10) .setFilter(filter); Statistics stats = options.statisticsPtr(); assert (options.createIfMissing() == true ); assert (options.writeBufferSize() == 8 * SizeUnit.KB); assert (options.maxWriteBufferNumber() == 3); assert (options.disableSeekCompaction() == true ); assert (options.blockSize() == 64 * SizeUnit.KB); assert (options.maxBackgroundCompactions() == 10); Let me check if they have changed anything in the master.
          Hide
          criccomini Chris Riccomini added a comment -

          Hmm, yeah it looks to exist on https://github.com/facebook/rocksdb/blob/master/java/org/rocksdb/Options.java, but not on my local checkout. Let me see what's going on.

          Show
          criccomini Chris Riccomini added a comment - Hmm, yeah it looks to exist on https://github.com/facebook/rocksdb/blob/master/java/org/rocksdb/Options.java , but not on my local checkout. Let me see what's going on.
          Hide
          naveenatceg Naveen Somasundaram added a comment -

          Chris Riccomini I just did a rebuild and I see some of the options are no longer working (specifically, setCacheSize, setBlockSize, setFilter). I'm checking what has changed recently.

          Show
          naveenatceg Naveen Somasundaram added a comment - Chris Riccomini I just did a rebuild and I see some of the options are no longer working (specifically, setCacheSize, setBlockSize, setFilter). I'm checking what has changed recently.
          Hide
          naveenatceg Naveen Somasundaram added a comment -

          Chris Riccomini They have changed the API after I forked the branch.
          https://github.com/facebook/rocksdb/commit/66f62e5c78a189c910424442b5634467dd609530
          I need to figure out how this translates to the newly introduced Options/API. I'll submit a another patch with that fix.

          Show
          naveenatceg Naveen Somasundaram added a comment - Chris Riccomini They have changed the API after I forked the branch. https://github.com/facebook/rocksdb/commit/66f62e5c78a189c910424442b5634467dd609530 I need to figure out how this translates to the newly introduced Options/API. I'll submit a another patch with that fix.
          Hide
          criccomini Chris Riccomini added a comment -

          K, sounds good.

          Show
          criccomini Chris Riccomini added a comment - K, sounds good.
          Hide
          criccomini Chris Riccomini added a comment -

          I've posted the initial RocksDB pull request here. Accompanying Github issue is here. Awaiting reviews.

          Show
          criccomini Chris Riccomini added a comment - I've posted the initial RocksDB pull request here . Accompanying Github issue is here . Awaiting reviews.
          Hide
          naveenatceg Naveen Somasundaram added a comment -

          Chris Riccomini I have uploaded the patch with the revised API. This should unblock you for now, I am still yet to address the review comments in this patch.

          Show
          naveenatceg Naveen Somasundaram added a comment - Chris Riccomini I have uploaded the patch with the revised API. This should unblock you for now, I am still yet to address the review comments in this patch.
          Hide
          criccomini Chris Riccomini added a comment -

          I test your patch with my latest pull request. It worked on my Mac. On my RHEL box, I got:

          :samza-kv-leveldb_2.10:test
          
          getNonExistantIsNull[8] FAILED
              java.lang.UnsatisfiedLinkError: /tmp/librocksdbjni4046465896437536234..so: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /tmp/librocksdbjni4046465896437536234..so)
                  at java.lang.ClassLoader$NativeLibrary.load(Native Method)
                  at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1929)
                  at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1814)
                  at java.lang.Runtime.load0(Runtime.java:809)
                  at java.lang.System.load(System.java:1083)
                  at org.rocksdb.NativeLibraryLoader.loadLibraryFromJar(NativeLibraryLoader.java:51)
                  at org.rocksdb.RocksDB.loadLibrary(RocksDB.java:50)
                  at org.rocksdb.RocksDB.<clinit>(RocksDB.java:28)
                  at org.rocksdb.Options.<clinit>(Options.java:17)
                  at org.apache.samza.storage.kv.TestKeyValueStores.setup(TestKeyValueStores.scala:65)
          
          Show
          criccomini Chris Riccomini added a comment - I test your patch with my latest pull request . It worked on my Mac. On my RHEL box, I got: :samza-kv-leveldb_2.10:test getNonExistantIsNull[8] FAILED java.lang.UnsatisfiedLinkError: /tmp/librocksdbjni4046465896437536234..so: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /tmp/librocksdbjni4046465896437536234..so) at java.lang.ClassLoader$NativeLibrary.load(Native Method) at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1929) at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1814) at java.lang.Runtime.load0(Runtime.java:809) at java.lang.System.load(System.java:1083) at org.rocksdb.NativeLibraryLoader.loadLibraryFromJar(NativeLibraryLoader.java:51) at org.rocksdb.RocksDB.loadLibrary(RocksDB.java:50) at org.rocksdb.RocksDB.<clinit>(RocksDB.java:28) at org.rocksdb.Options.<clinit>(Options.java:17) at org.apache.samza.storage.kv.TestKeyValueStores.setup(TestKeyValueStores.scala:65)
          Hide
          criccomini Chris Riccomini added a comment -

          Running the same test on my Debian Vagrant image, the tests pass. So it looks like some incompatibility between Linux flavors. Any ideas?

          Show
          criccomini Chris Riccomini added a comment - Running the same test on my Debian Vagrant image, the tests pass. So it looks like some incompatibility between Linux flavors. Any ideas?
          Hide
          criccomini Chris Riccomini added a comment -

          Saw this:

          Linux build tips: Build on REHL 5.6 for best compatibility across linux versions.

          On the fuse source page. Going to try building on a RHEL 5.6 virtualbox.

          Show
          criccomini Chris Riccomini added a comment - Saw this: Linux build tips: Build on REHL 5.6 for best compatibility across linux versions. On the fuse source page. Going to try building on a RHEL 5.6 virtualbox.
          Hide
          criccomini Chris Riccomini added a comment -

          Latest patch now builds against CentOS 5.6 Vagrant VMs. This appears to have fixed the GLIBC problem that I was seeing.

          I have run the RocksDB tests attached to this JIRA with the fat jar produced from my RocksDB patch. The tests pass on Mac OSX 64-bit, Linux 32-bit Ubuntu Trusty, Linux 32-bit CentOS 5.6, Linux 64-bit CentOS 5.6, and Linux 64-bit RHEL 6.4.

          Show
          criccomini Chris Riccomini added a comment - Latest patch now builds against CentOS 5.6 Vagrant VMs. This appears to have fixed the GLIBC problem that I was seeing. I have run the RocksDB tests attached to this JIRA with the fat jar produced from my RocksDB patch. The tests pass on Mac OSX 64-bit, Linux 32-bit Ubuntu Trusty, Linux 32-bit CentOS 5.6, Linux 64-bit CentOS 5.6, and Linux 64-bit RHEL 6.4.
          Hide
          criccomini Chris Riccomini added a comment -

          RocksDB cross-platform fat jar has been merged into RocksDB trunk.

          https://github.com/facebook/rocksdb/pull/318

          Working with the FB folks to publish to Maven now.

          Show
          criccomini Chris Riccomini added a comment - RocksDB cross-platform fat jar has been merged into RocksDB trunk. https://github.com/facebook/rocksdb/pull/318 Working with the FB folks to publish to Maven now.
          Hide
          criccomini Chris Riccomini added a comment -

          RocksDB 3.5.1 is now in Maven central, and ready for use.

          Show
          criccomini Chris Riccomini added a comment - RocksDB 3.5.1 is now in Maven central , and ready for use.
          Hide
          criccomini Chris Riccomini added a comment -

          Attaching Naveen Somasundaram's updated patch from the RB. I also made a couple of minor changes:

          1. Eliminated samza-kv-rocksdb dependency in samza-kv-leveldb. Added dependency in samza-test.
          2. Added javadocs to lex comparator.
          3. Fixed some spacing/bracket issues.
          4. Removed camel case configs in RocksDB options object.
          Show
          criccomini Chris Riccomini added a comment - Attaching Naveen Somasundaram 's updated patch from the RB. I also made a couple of minor changes: Eliminated samza-kv-rocksdb dependency in samza-kv-leveldb. Added dependency in samza-test. Added javadocs to lex comparator. Fixed some spacing/bracket issues. Removed camel case configs in RocksDB options object.
          Hide
          criccomini Chris Riccomini added a comment -

          I have merged in the latest patch.

          We will also need to update the Samza website/docs with RocksDB-related documentation. I've opened up SAMZA-434 for this.

          Lastly, I've opened up SAMZA-435 to track removing LevelDB from Samza once we're happy with RocksDB.

          Show
          criccomini Chris Riccomini added a comment - I have merged in the latest patch. We will also need to update the Samza website/docs with RocksDB-related documentation. I've opened up SAMZA-434 for this. Lastly, I've opened up SAMZA-435 to track removing LevelDB from Samza once we're happy with RocksDB.

            People

            • Assignee:
              naveenatceg Naveen Somasundaram
              Reporter:
              criccomini Chris Riccomini
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development