SPARK-24539: HistoryServer does not display metrics from tasks that complete after stage failure

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.3.1
    • Fix Version/s: None
    • Component/s: Web UI
    • Labels: None

    Description

      I noticed that task metrics for tasks that complete after their stage has failed do not show up in the new history server. I suspect this is because all of the tasks succeeded after the stage had already been failed, so they were completions from a "zombie" taskset. The task metrics (e.g. shuffle read size and shuffle write size) do not show up at all: not in the task table, not in the executor table, and not in the overall stage summary metrics. (They might not show up in the job summary page either, but in the event logs I have, there is another successful stage attempt after this one, and that is the only thing that shows up on the jobs page.) If you get the task details from the API endpoint (e.g. http://[host]:[port]/api/v1/applications/[app-id]/stages/[stage-id]/[stage-attempt]), you can see the successful tasks and all of their metrics.

      Unfortunately the event logs I have are huge and I don't have a small repro handy, but I hope that description is enough to go on.

      I loaded the same event logs in the history server (SHS) from Spark 2.2 and they appear fine there.
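      To cross-check what the UI omits, the stage-attempt endpoint mentioned above can be queried directly and the metrics of the successful tasks summed by hand. The following is a minimal Python sketch, not part of the original report. It assumes the JSON response contains a "tasks" mapping whose entries carry "status" and "taskMetrics" (with "shuffleReadMetrics" and "shuffleWriteMetrics"); those field names, and the helper names themselves, are assumptions and should be checked against the Spark version in use.

```python
import json
from urllib.request import urlopen


def stage_attempt_url(base_url, app_id, stage_id, stage_attempt):
    """Build the stage-attempt URL in the form quoted in the description."""
    base = base_url.rstrip("/")
    return f"{base}/api/v1/applications/{app_id}/stages/{stage_id}/{stage_attempt}"


def fetch_stage_attempt(url):
    """Fetch and parse the JSON document for one stage attempt."""
    with urlopen(url) as resp:
        return json.load(resp)


def summarize_successful_tasks(stage_data):
    """Sum shuffle read/write bytes over tasks whose status is SUCCESS.

    Assumed response shape: {"tasks": {task_id: {"status": ...,
    "taskMetrics": {...}}}}. Missing fields are counted as zero.
    """
    totals = {"tasks": 0, "shuffle_read_bytes": 0, "shuffle_write_bytes": 0}
    for task in (stage_data.get("tasks") or {}).values():
        if task.get("status") != "SUCCESS":
            continue
        metrics = task.get("taskMetrics") or {}
        read = metrics.get("shuffleReadMetrics") or {}
        write = metrics.get("shuffleWriteMetrics") or {}
        totals["tasks"] += 1
        totals["shuffle_read_bytes"] += (
            read.get("remoteBytesRead", 0) + read.get("localBytesRead", 0)
        )
        totals["shuffle_write_bytes"] += write.get("bytesWritten", 0)
    return totals
```

      With the placeholders filled in, something like summarize_successful_tasks(fetch_stage_attempt(stage_attempt_url("http://[host]:[port]", "[app-id]", "[stage-id]", "[stage-attempt]"))) gives totals that can be compared against the (empty) values shown on the stage page.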

            People

              Assignee: Unassigned
              Reporter: Imran Rashid (irashid)
              Votes: 0
              Watchers: 5
