  1. Spark
  2. SPARK-41339

RocksDB state store WriteBatch doesn't clean up native memory

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.1
    • Fix Version/s: 3.3.2, 3.4.0
    • Component/s: SQL, Structured Streaming
    • Labels: None

    Description

      The RocksDB state store uses a WriteBatch to hold updates, which are written in a single transaction on commit. After a successful task, abort is called somewhat indirectly, and it in turn calls writeBatch.clear(). However, the data for a WriteBatch is stored in a std::string in the native code (rocksdb/write_batch.h at main · facebook/rocksdb · GitHub). It is not clear why it is stored as a string, but it is.

      writeBatch.clear() simply calls rep_.clear() and rep_.resize() (rocksdb/write_batch.cc at main · facebook/rocksdb · GitHub), neither of which releases the memory that the std::string instance has accumulated. The only way to release this memory is to delete the WriteBatch object itself.
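
      The behavior described above can be reproduced with a plain std::string, without RocksDB. The following is a minimal sketch, not RocksDB code: kHeaderSize and both function names are hypothetical and chosen for illustration, and the 12-byte header is an assumption about what resize() shrinks the buffer to. The C++ standard does not strictly require clear() to keep capacity, but common standard library implementations do.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Assumed stand-in for the small fixed header that the batch keeps in rep_.
constexpr std::size_t kHeaderSize = 12;

// Mimics the clear-then-resize sequence and reports how much capacity the
// string still holds afterwards.
std::size_t capacity_after_clear(std::size_t payload_bytes) {
    std::string rep(kHeaderSize, '\0');
    rep.append(payload_bytes, 'x');  // simulate updates accumulated in a batch
    rep.clear();                     // size drops to 0
    rep.resize(kHeaderSize);         // size goes back to the header size
    assert(rep.size() == kHeaderSize);
    return rep.capacity();           // on common implementations, the large buffer is still held
}

// Mimics deleting the owning object and creating a fresh one.
std::size_t capacity_after_replace(std::size_t payload_bytes) {
    auto rep = std::make_unique<std::string>(kHeaderSize, '\0');
    rep->append(payload_bytes, 'x');
    // Assigning a new object destroys the old string, which frees its buffer.
    rep = std::make_unique<std::string>(kHeaderSize, '\0');
    assert(rep->size() == kHeaderSize);
    return rep->capacity();          // only a small buffer remains
}
```

      With a 1 MiB payload, capacity_after_clear is expected to report at least 1 MiB on common implementations, while capacity_after_replace reports only a few bytes, which matches the observation that only deleting the owning object returns the memory.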

      Currently, the memory taken by all write batches remains allocated until the RocksDB state store instance is closed, which never happens during normal operation because all partitions remain loaded on an executor after a task completes.

          People

            Assignee: Adam Binford (kimahriman)
            Reporter: Adam Binford (kimahriman)