Spark / SPARK-8202

PySpark: infinite loop during external sort

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.4.1, 1.5.0
    • Component/s: PySpark
    • Labels:
      None

      Description

      During external sort, the batch size grows up to its maximum of 10000 and then shrinks down to zero, which causes an infinite loop.

      Assuming that items usually have similar sizes, there is no need to adjust the batch size after the first spill.
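
      The failure mode described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not the PySpark implementation: the function name `consume`, the item-count "memory" model, the halving and 1.5x growth policies, and the assumption that memory stays over the limit after the first spill are all hypothetical. Only the cap of 10000 and the idea of freezing the batch size after the first spill come from the issue text.

```python
import itertools

MAX_BATCH = 10000  # upper bound on the batch size, as stated in the issue


def consume(items, spill_threshold, adjust_after_spill, max_steps=10_000):
    """Read `items` in batches and simulate spilling (hypothetical model).

    Returns (items_read, finished). `finished` is False when the loop was
    aborted after `max_steps` iterations, i.e. it would have spun forever.
    """
    iterator = iter(items)
    batch = 100        # assumed initial batch size
    buffered = 0       # stand-in for memory used by the in-memory chunk
    spills = 0
    items_read = 0

    for _ in range(max_steps):
        chunk = list(itertools.islice(iterator, batch))
        items_read += len(chunk)
        buffered += len(chunk)
        if len(chunk) < batch:
            return items_read, True  # input exhausted, normal exit

        # Assumption: once a spill has happened, memory use stays over the
        # limit, so every following batch triggers another spill.
        over_limit = buffered > spill_threshold or spills > 0
        if over_limit:
            buffered = 0  # pretend the chunk was written to disk
            spills += 1
            if adjust_after_spill:
                # Shrinking policy (assumed halving). Once batch reaches 0,
                # islice() yields nothing, len(chunk) < batch is never true,
                # and the loop makes no further progress.
                batch //= 2
        elif spills == 0:
            # Grow only before the first spill, capped at MAX_BATCH.
            batch = min(int(batch * 1.5), MAX_BATCH)

    return items_read, False  # aborted: no progress was being made


if __name__ == "__main__":
    total = 1_000_000
    # Keeps adjusting after spills: gets stuck with a batch size of zero.
    print(consume(range(total), 100_000, adjust_after_spill=True))
    # Batch size frozen after the first spill: reads all of the input.
    print(consume(range(total), 100_000, adjust_after_spill=False))
```

      In this sketch the `max_steps` guard exists only so that the stuck case terminates and can be observed; a real sorter has no such guard, which is why a batch size of zero turns into an infinite loop.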

            People

            • Assignee: Davies Liu (davies)
            • Reporter: Davies Liu (davies)
            • Votes: 0
            • Watchers: 3
