  Spark / SPARK-19659 Fetch big blocks to disk when shuffle-read / SPARK-20801

Store accurate size of blocks in MapStatus when it's above threshold.


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.1
    • Fix Version/s: 2.2.0
    • Component/s: Spark Core
    • Labels: None

    Description

      Currently, when the number of reducers is above 2000, HighlyCompressedMapStatus is used to store the sizes of shuffle blocks. HighlyCompressedMapStatus stores only the average size of the non-empty blocks, which is bad for memory control on the shuffle-read side: a block much larger than the average is underestimated and may be fetched into memory when it should go to disk. It makes sense to store the accurate size of a block when it is above a threshold, as sketched below.
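
      A minimal sketch of the idea, not Spark's actual implementation: blocks below a threshold are folded into an average, while blocks at or above it keep their exact sizes, so a huge block is no longer underestimated on the reader side. The class CompressedStatus, the compress helper, and the threshold value here are hypothetical names for illustration; Spark 2.2.0 exposes a related setting, spark.shuffle.accurateBlockThreshold.

      // Sketch only -- not Spark's HighlyCompressedMapStatus. Small blocks are
      // folded into an average; blocks at or above the threshold keep exact sizes.
      object AccurateSizeSketch {

        // Hypothetical compressed status: average for small blocks,
        // exact-size map for huge ones, set of empty block ids.
        // (A real implementation would use a bitmap such as RoaringBitmap.)
        final class CompressedStatus(
            avgSize: Long,
            emptyBlocks: Set[Int],
            hugeBlockSizes: Map[Int, Long]) {

          // Estimated size of the block for the given reducer.
          def getSizeForBlock(reduceId: Int): Long =
            if (emptyBlocks.contains(reduceId)) 0L
            else hugeBlockSizes.getOrElse(reduceId, avgSize)
        }

        // Build a status from uncompressed per-reducer block sizes.
        def compress(uncompressedSizes: Array[Long], accurateThreshold: Long): CompressedStatus = {
          val empty = scala.collection.mutable.Set[Int]()
          val huge  = scala.collection.mutable.Map[Int, Long]()
          var totalSmall = 0L
          var numSmall   = 0

          uncompressedSizes.zipWithIndex.foreach { case (size, i) =>
            if (size == 0) empty += i
            else if (size >= accurateThreshold) huge += i -> size  // keep exact size
            else { totalSmall += size; numSmall += 1 }             // folded into the average
          }
          val avg = if (numSmall > 0) totalSmall / numSmall else 0L
          new CompressedStatus(avg, empty.toSet, huge.toMap)
        }

        def main(args: Array[String]): Unit = {
          // block 0: empty, block 1: small (10 MiB), block 2: huge (500 MiB)
          val sizes  = Array(0L, 10L << 20, 500L << 20)
          val status = compress(sizes, accurateThreshold = 100L << 20)
          (0 until sizes.length).foreach { i =>
            println(s"block $i -> ${status.getSizeForBlock(i)} bytes")
          }
        }
      }

      With exact sizes recorded for huge blocks, the shuffle reader can decide per block whether it fits in memory or should be fetched to disk, which is what the parent issue SPARK-19659 needs.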


          People

            Assignee: Jin Xing (jinxing6042@126.com)
            Reporter: Jin Xing (jinxing6042@126.com)
