Uploaded image for project: 'Samza'
  1. Samza
  2. SAMZA-873

Avoid unnecessary flushes in CachedStore

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.0
    • Fix Version/s: 0.10.1
    • Component/s: kv
    • Labels:
      None
    • Flags:
      Patch

      Description

      The class org.apache.samza.storage.kv.CachedStore is currently calling store.flush() when evicting dirty entries. This in turn causes RocksDB to flush its memtables much more than necessary, causing slowdowns.

      In a mixed put / get workload, e.g. 2 gets for 1 put with an object cache size of 1000, RocksDB will flush its memtable roughly every 333 calls to put(); that is every time the eldest entry from the cache is dirty. In our benchmarks, this leads to a more than 20x drop in throughput.

      The attached patch fixes the issue as follows:

      • CachedStore.put() no longer flushes when evicting dirty entries.
        It calls store.putAll() with all dirty entries and resets the dirty list and count but does not call store.flush().
      • Likewise, CachedStore.cache.removeEldestEntry() no longer flushes when evicting dirty entries.
        It calls store.putAll() on all dirty entries and resets the dirty list and count.
      • The behavior of CachedStore.flush() is unaffected.

        Attachments

          Activity

            People

            • Assignee:
              nicolas@movio.co Nicolas Maquet
              Reporter:
              nicolas@movio.co Nicolas Maquet
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: