Spark > SPARK-19659 Fetch big blocks to disk when shuffle-read > SPARK-20801

Store accurate size of blocks in MapStatus when it's above threshold.


    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.1
    • Fix Version/s: 2.2.0
    • Component/s: Spark Core
    • Labels:
      None

      Description

      Currently, when the number of reducers is above 2000, HighlyCompressedMapStatus is used to store the sizes of shuffle blocks. In HighlyCompressedMapStatus, only the average size is stored for non-empty blocks, which is bad for memory control when fetching shuffle blocks: a few very large blocks are reported as average-sized, so the fetcher underestimates them. It makes sense to store the accurate size of a block when it is above a threshold.
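      The idea can be sketched as follows: keep the average for ordinary non-empty blocks, but record exact sizes for blocks above a threshold. This is a minimal illustration, not Spark's actual HighlyCompressedMapStatus (which tracks empty blocks with a RoaringBitmap and uses compressed size encoding); the `threshold` parameter and the `CompressedStatus` name are assumptions for the sketch.

```scala
// Sketch of a map status that stores the average size for ordinary blocks
// but exact sizes for "huge" blocks above a threshold.
final case class CompressedStatus(
    avgSize: Long,                  // average of non-empty, non-huge blocks
    hugeBlockSizes: Map[Int, Long], // exact sizes for blocks >= threshold
    emptyBlocks: Set[Int]) {        // reduce ids with zero-length blocks

  // Reported size: exact for huge blocks, 0 for empty ones, average otherwise.
  def getSizeForBlock(reduceId: Int): Long =
    hugeBlockSizes.getOrElse(
      reduceId,
      if (emptyBlocks.contains(reduceId)) 0L else avgSize)
}

object CompressedStatus {
  def apply(sizes: Array[Long], threshold: Long): CompressedStatus = {
    val empty = sizes.zipWithIndex.collect { case (s, i) if s == 0L => i }.toSet
    val huge  = sizes.zipWithIndex.collect {
      case (s, i) if s >= threshold => i -> s
    }.toMap
    // Average only over blocks that are neither empty nor stored exactly.
    val rest = sizes.filter(s => s > 0L && s < threshold)
    val avg  = if (rest.isEmpty) 0L else rest.sum / rest.length
    CompressedStatus(avg, huge, empty)
  }
}
```

      With sizes `[0, 100, 200, 10000]` and a threshold of 1000, block 3 is reported exactly as 10000 instead of being folded into the average, while blocks 1 and 2 share the average of 150.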


            People

            • Assignee:
              jinxing6042@126.com Jin Xing
            • Reporter:
              jinxing6042@126.com Jin Xing
            • Votes:
              0
            • Watchers:
              4
