SPARK-29337: How to cache a table and pin it in memory so that it does not spill to disk on the Thrift Server

    Details

    • Type: Question
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 2.3.0
    • Fix Version/s: None
    • Component/s: SQL

      Description

      Hi Team,

      How can we pin a table in the cache so that it is not swapped out of memory?

      Situation: We use MicroStrategy for BI reporting, and the semantic layer is already built. We wanted to cache heavily used tables with the Spark SQL statement CACHE TABLE <table_name>; and did so in the Spark context of the Thrift server. Please see the snapshot of the cached tables below, which went to disk over time: initially everything was in memory, but now some of it is in memory and some is on disk. That disk may be local disk, which can be relatively more expensive to read from than S3, so queries may take longer and run times are inconsistent from the user's perspective. When more queries run against the cached tables, copies of the cached table data are made, and those copies do not stay in memory, which makes reports run longer.

      How can we pin these tables so that they are not swapped to disk? Spark memory management is dynamic, so how can we keep these few tables pinned in memory?
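
      One way to reason about why cached blocks end up on disk is to compare the cached table size with the storage memory each executor can offer. The sketch below is only a rough estimate, assuming the documented defaults of Spark's unified memory model (300 MB reserved, spark.memory.fraction = 0.6, spark.memory.storageFraction = 0.5). The executor heap size, executor count, and table sizes are hypothetical inputs, not values from this report, and the function names are illustrative.

```python
# Rough estimate of the per-executor memory Spark can use for cached blocks.
# The defaults follow Spark's documented unified memory model; the heap size,
# executor count, and table sizes used below are hypothetical example inputs.

RESERVED_MB = 300  # fixed reservation taken off the heap before the split


def storage_memory_mb(executor_heap_mb, memory_fraction=0.6, storage_fraction=0.5):
    """Return (unified_mb, protected_storage_mb) for one executor."""
    usable = executor_heap_mb - RESERVED_MB
    if usable <= 0:
        raise ValueError("executor heap must exceed the reserved memory")
    unified = usable * memory_fraction       # spark.memory.fraction
    protected = unified * storage_fraction   # spark.memory.storageFraction
    return unified, protected


def fits_in_memory(table_size_mb, executors, executor_heap_mb):
    """True if the table fits in the eviction-protected storage region cluster-wide."""
    _, protected = storage_memory_mb(executor_heap_mb)
    return table_size_mb <= protected * executors


if __name__ == "__main__":
    unified, protected = storage_memory_mb(8192)
    print(f"unified={unified:.1f} MB, protected storage={protected:.1f} MB")
    print(fits_in_memory(20_000, executors=4, executor_heap_mb=8192))
```

      The table size passed in should be the in-memory size shown on the Storage tab of the Spark UI, which can differ from the size of the source files. Cached blocks beyond the protected region can be evicted when execution needs the memory, and CACHE TABLE uses the MEMORY_AND_DISK storage level by default, so evicted blocks are written to local disk instead of being dropped.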

        Attachments

        1. Cache+Image.png
          19 kB
          Srini E

            People

            • Assignee: Unassigned
            • Reporter: Srini E (idfspark)
            • Votes: 0
            • Watchers: 4