  1. Spark
  2. SPARK-43311

RocksDB state store provider memory management enhancements


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.5.0
    • Component/s: Structured Streaming
    • Labels: None

    Description

      Today, when RocksDB is used as the state store provider, the memory used by writes that go through writeBatch is not capped. A related issue is that the state store coordinator can create multiple RocksDB instances on a single node without enforcing a global limit on native memory usage. Together, these issues can lead to OOM errors and task failures.

      We plan to address this through a series of improvements, such as:

      • remove writeBatch and use native RocksDB operations
      • use writeBufferManager to enforce a global memory limit across all instances on a single node, accounting for memtable and filter/index block usage as part of the block cache

      With these changes, we will avoid OOM issues caused by RocksDB native memory usage.
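As a rough illustration of the node-wide limit described above, the sketch below builds the Spark settings that enable bounded RocksDB memory usage and shows how a single cap would be divided. The configuration keys are the ones listed in the Structured Streaming guide for Spark 3.5, so check them against the documentation for your version. The helper names (`bounded_memory_conf`, `budget_split_mb`), the numeric values, and the range checks are assumptions made for this example and are not part of Spark.

```python
# Sketch only: helper names and values are hypothetical, not a Spark API.
ROCKSDB_PREFIX = "spark.sql.streaming.stateStore.rocksdb"
PROVIDER_CLASS = (
    "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider"
)


def bounded_memory_conf(max_memory_mb, write_buffer_ratio, high_priority_ratio):
    """Return Spark conf entries that cap RocksDB memory per node."""
    # These range checks are this example's own sanity checks (an assumption),
    # not a statement of what Spark validates internally.
    if max_memory_mb <= 0:
        raise ValueError("max_memory_mb must be positive")
    for name, ratio in (
        ("write_buffer_ratio", write_buffer_ratio),
        ("high_priority_ratio", high_priority_ratio),
    ):
        if not 0.0 <= ratio <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return {
        "spark.sql.streaming.stateStore.providerClass": PROVIDER_CLASS,
        f"{ROCKSDB_PREFIX}.boundedMemoryUsage": "true",
        f"{ROCKSDB_PREFIX}.maxMemoryUsageMB": str(max_memory_mb),
        f"{ROCKSDB_PREFIX}.writeBufferCacheRatio": str(write_buffer_ratio),
        f"{ROCKSDB_PREFIX}.highPriorityPoolRatio": str(high_priority_ratio),
    }


def budget_split_mb(max_memory_mb, write_buffer_ratio, high_priority_ratio):
    """Split the node-wide cap into the shares implied by the two ratios."""
    return {
        "write_buffers_mb": max_memory_mb * write_buffer_ratio,
        "high_priority_pool_mb": max_memory_mb * high_priority_ratio,
    }


if __name__ == "__main__":
    # Illustrative values only; tune for the actual workload.
    conf = bounded_memory_conf(500, 0.5, 0.1)
    for key, value in conf.items():
        print(f"--conf {key}={value}")
    print(budget_split_mb(500, 0.5, 0.1))
```

The resulting entries can be passed as `--conf` flags to `spark-submit` or applied one by one with `SparkSession.builder.config(key, value)`. Because the cap applies to all RocksDB instances on a node rather than to each instance, the ratios describe shares of one shared budget.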



          People

            Assignee: anishshri-db Anish Shrigondekar
            Reporter: anishshri-db Anish Shrigondekar
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved:
