Spark / SPARK-37481

Disappearance of skipped stages misleads bug hunting

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.2, 3.2.0, 3.3.0
    • Fix Version/s: 3.2.1, 3.3.0
    • Component/s: Spark Core
    • Labels: None

    Description

        1. With FetchFailedException and Map Stage Retries

      When rerunning the spark-sql shell with the original SQL from https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315, the following happened:

      1. Stage 3 threw a FetchFailedException, which caused both itself and its parent stage (stage 2) to retry.
      2. Stage 2 had been skipped earlier, but its attemptId was still 0, so when it was retried it was removed from `Skipped Stages`.

      As a result, the DAG of Job 2 no longer shows stage 2 as skipped.
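
The disappearance can be pictured with a toy model. The sketch below is not Spark's code: `ToyStageStore`, `record`, `skipped_stages`, and `simulate` are hypothetical names, and the assumption that stage entries are keyed by (stage id, attempt id) is made only for illustration. Under that assumption, a retry that reuses attempt id 0 overwrites the skipped entry, whereas a retry with a distinct attempt id leaves it visible.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class StageEntry:
    stage_id: int
    attempt_id: int
    status: str  # e.g. "SKIPPED" or "ACTIVE"


class ToyStageStore:
    """Hypothetical stand-in for a UI store keyed by (stage_id, attempt_id)."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[int, int], StageEntry] = {}

    def record(self, stage_id: int, attempt_id: int, status: str) -> None:
        # Writing with an existing key replaces the earlier entry.
        self._entries[(stage_id, attempt_id)] = StageEntry(stage_id, attempt_id, status)

    def skipped_stages(self) -> List[int]:
        # Stage ids that still have at least one SKIPPED entry.
        return sorted({e.stage_id for e in self._entries.values() if e.status == "SKIPPED"})


def simulate(retry_attempt_id: int) -> List[int]:
    """Record stage 2 as skipped (attempt 0), then retry it with the given attempt id."""
    store = ToyStageStore()
    store.record(2, 0, "SKIPPED")
    store.record(2, retry_attempt_id, "ACTIVE")
    return store.skipped_stages()


# Retry reuses attempt id 0: the skipped entry is overwritten.
lost = simulate(retry_attempt_id=0)
# Retry uses a fresh attempt id (assumed here only for contrast): the skipped entry survives.
kept = simulate(retry_attempt_id=1)
print(lost, kept)
```

In this model the only difference between the two runs is the attempt id given to the retry, which matches the observation in step 2 above.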

      Besides, a retried stage usually runs only a subset of the original stage's tasks. If we present it as the original attempt, its metrics can be misleading.
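
A small sketch of why the metrics differ, under stated assumptions: the partition count and the set of still-available map outputs below are made-up illustrative values, and `tasks_for_retry` is a hypothetical helper rather than a Spark API. It models a retry that recomputes only the partitions whose output is missing.

```python
from typing import List, Set


def tasks_for_retry(num_partitions: int, available_outputs: Set[int]) -> List[int]:
    """Return the partitions whose map output is missing and must be recomputed."""
    return [p for p in range(num_partitions) if p not in available_outputs]


# Illustrative values only: an 8-partition map stage with two lost outputs.
num_partitions = 8
available = {0, 1, 2, 4, 5, 7}

retry_tasks = tasks_for_retry(num_partitions, available)
print(f"original attempt: {num_partitions} tasks, retry: {len(retry_tasks)} tasks")
```

If the retry were displayed as attempt 0, a reader could take its task count and run time for those of the full stage.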

          People

            Assignee: Qin Yao Kent Yao
            Reporter: Qin Yao Kent Yao