Spark / SPARK-25091

UNCACHE TABLE, CLEAR CACHE, rdd.unpersist() do not clean up executor memory


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 2.3.1
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      UNCACHE TABLE and CLEAR CACHE do not clean up executor memory.

      In the Spark UI, the Storage tab shows the cached table removed, but the Executors tab shows the executors still holding the RDD; their storage memory is not released. This wastes a large amount of executor memory: on subsequent CACHE TABLE calls, the newly cached tables spill to disk instead of using the storage memory that should have been reclaimed.

      Steps to reproduce:

      CACHE TABLE test.test_cache;

      UNCACHE TABLE test.test_cache;

      == Storage tab shows the table is no longer cached; Executors tab shows executor storage memory unchanged ==

      CACHE TABLE test.test_cache;

      CLEAR CACHE;

      == Storage tab shows the table is no longer cached; Executors tab shows executor storage memory unchanged ==

      The same behavior occurs when calling df.unpersist() from PySpark.
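
      The SQL steps above can be sketched with the DataFrame API as follows. This is a minimal, hypothetical repro harness (table contents and session settings are illustrative, not taken from the report); it only demonstrates that the API reports the data as uncached, while the reported bug is that executor storage memory in the UI does not shrink accordingly:

      ```python
      from pyspark.sql import SparkSession

      # Local session standing in for the reporter's cluster (assumption).
      spark = (SparkSession.builder
               .master("local[1]")
               .appName("uncache-repro")
               .getOrCreate())

      df = spark.range(100_000)
      df.cache()
      df.count()                    # action to actually materialize the cache
      print(df.is_cached)           # True: entry appears in the Storage tab

      df.unpersist(blocking=True)   # analogous to UNCACHE TABLE / CLEAR CACHE
      print(df.is_cached)           # False: Storage tab entry is gone, yet per
                                    # this report the executors' storage memory
                                    # shown in the Executors tab does not drop

      spark.stop()
      ```

      Checking the Executors tab (or ExecutorMetrics) before and after the unpersist call is what exposes the discrepancy the report describes.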

      Attachments

        1. 4.png
          35 kB
          Chao Fang
        2. 3.png
          18 kB
          Chao Fang
        3. 2.png
          18 kB
          Chao Fang
        4. 1.png
          18 kB
          Chao Fang
        5. 0.png
          21 kB
          Chao Fang


            People

              Assignee: Unassigned
              Reporter: Yunling Cai (cyunling)
              Votes: 1
              Watchers: 8
