Spark / SPARK-44239

Free memory allocated by large vectors when vectors are reset


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.5.0
    • Fix Version/s: 4.0.0
    • Component/s: SQL
    • Labels: None

    Description

      When Spark reads a data file into WritableColumnVectors, the memory allocated by those vectors is not freed until the VectorizedColumnReader completes.

      Reusing the allocated array objects saves memory allocation time, but it also holds on to too much unused memory after a large vector batch has been read.

      Add a memory reserve policy for this scenario that reuses the allocated array objects for small column vectors and frees the memory for huge column vectors.
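
The proposed policy can be illustrated with a minimal, standalone sketch. It is not Spark's implementation: the class `ReusableIntVector` and the parameters `defaultCapacity` and `hugeThreshold` are hypothetical names chosen for illustration, and a plain `int[]` stands in for the vector's backing storage. On `reset()`, a buffer at or below the threshold is kept for reuse, while a larger one is dropped and replaced with a default-sized array.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a writable column vector backed by an int array. */
final class ReusableIntVector {
    private final int defaultCapacity; // capacity restored after a huge batch (assumed parameter)
    private final int hugeThreshold;   // capacities above this count as "huge" (assumed parameter)
    private int[] data;
    private int size;

    ReusableIntVector(int defaultCapacity, int hugeThreshold) {
        if (defaultCapacity <= 0 || hugeThreshold < defaultCapacity) {
            throw new IllegalArgumentException("need 0 < defaultCapacity <= hugeThreshold");
        }
        this.defaultCapacity = defaultCapacity;
        this.hugeThreshold = hugeThreshold;
        this.data = new int[defaultCapacity];
    }

    /** Appends a value, doubling the backing array when it is full. */
    void append(int value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[size++] = value;
    }

    int capacity() { return data.length; }

    int size() { return size; }

    /**
     * Prepares the vector for the next batch.
     * Small buffers are kept so the next batch avoids a new allocation;
     * huge buffers are released so the garbage collector can reclaim them.
     */
    void reset() {
        size = 0;
        if (data.length > hugeThreshold) {
            data = new int[defaultCapacity];
        }
    }

    public static void main(String[] args) {
        ReusableIntVector v = new ReusableIntVector(4, 16);

        for (int i = 0; i < 10; i++) v.append(i); // grows 4 -> 8 -> 16
        v.reset();
        System.out.println("after small batch: " + v.capacity()); // buffer reused

        for (int i = 0; i < 20; i++) v.append(i); // grows 16 -> 32
        v.reset();
        System.out.println("after huge batch: " + v.capacity());  // buffer released
    }
}
```

The threshold trades allocation time against retained memory: a higher value keeps more buffers alive for reuse, a lower value returns memory sooner at the cost of reallocating on the next large batch.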

       

      Attachments

        1. image-2023-06-29-12-58-12-256.png
          257 kB
          Wan Kun
        2. image-2023-06-29-13-03-15-470.png
          124 kB
          Wan Kun


          People

            Assignee: Wan Kun (wankun)
            Reporter: Wan Kun (wankun)
            Votes: 0
            Watchers: 4
