Spark / SPARK-37481

Disappearance of skipped stages misleads bug hunting

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.2, 3.2.0, 3.3.0
    • Fix Version/s: 3.2.1, 3.3.0
    • Component/s: Spark Core
    • Labels: None

    Description

        1. With FetchFailedException and Map Stage Retries

      When rerunning the spark-sql shell with the original SQL from https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315:

      1. Stage 3 threw a FetchFailedException, which caused both it and its parent stage (stage 2) to retry.
      2. Stage 2 had previously been skipped, but its attemptId was still 0, so when the retry happened it was removed from `Skipped Stages`.

      As a result, the DAG of Job 2 no longer shows stage 2 as skipped.
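
      The collision described above can be illustrated with a toy model. This is only a sketch, not Spark's actual status-tracking code: `StageStore`, `StageRecord`, and `simulate` are hypothetical names, and the assumption that stage records are keyed by (stage id, attempt id) is made purely for illustration.

```python
# Toy model only: NOT Spark's real UI/status code. All names are hypothetical.
from dataclasses import dataclass


@dataclass
class StageRecord:
    stage_id: int
    attempt_id: int
    status: str  # e.g. "SKIPPED" or "ACTIVE"


class StageStore:
    """Hypothetical store keyed by (stage_id, attempt_id) (an assumption)."""

    def __init__(self):
        self._records = {}

    def record(self, stage_id, attempt_id, status):
        # Writing with an existing key replaces the earlier record.
        key = (stage_id, attempt_id)
        self._records[key] = StageRecord(stage_id, attempt_id, status)

    def skipped_stage_ids(self):
        return sorted(
            r.stage_id for r in self._records.values() if r.status == "SKIPPED"
        )


def simulate():
    store = StageStore()
    # Stage 2 is skipped; it never ran, so it is recorded with attempt 0.
    store.record(2, 0, "SKIPPED")
    before = store.skipped_stage_ids()
    # After the fetch failure, stage 2 is retried, still with attempt 0.
    store.record(2, 0, "ACTIVE")
    after = store.skipped_stage_ids()
    return before, after
```

      In this model the retry reuses the key (2, 0), so the skipped record is overwritten and stage 2 drops out of the skipped list.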

      Besides, a retried stage usually runs only a subset of the original stage's tasks. If the retry is presented as the original attempt, its metrics can be misleading.
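
      To make the metrics concern concrete, here is a small sketch with made-up numbers. The partition count (200), the lost partitions, and the helper `tasks_for_attempt` are all hypothetical; the only point is that a retry covering a few partitions looks very different from a full run.

```python
# Illustrative numbers only; nothing here comes from the reported job.
def tasks_for_attempt(num_partitions, lost_partitions=None):
    """Return the partitions a hypothetical stage attempt has to compute.

    With no lost partitions this models a full (original) run; otherwise it
    models a retry that recomputes only the lost partitions.
    """
    if lost_partitions is None:
        return list(range(num_partitions))
    return sorted(p for p in set(lost_partitions) if 0 <= p < num_partitions)


original = tasks_for_attempt(200)
retry = tasks_for_attempt(200, lost_partitions=[17, 3, 42])

# Fraction of the original work that the retry actually represents.
coverage = len(retry) / len(original)
```

      If the retry were displayed as the original attempt, a reader could mistake a run over three partitions for the whole stage and misjudge its duration or shuffle volume.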

          People

            Qin Yao Kent Yao 2
            Qin Yao Kent Yao 2