Spark / SPARK-26260

Task Summary Metrics for Stage Page: Efficient implementation for SHS when using disk store.

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: Spark Core
    • Labels: None

    Description

      Currently, task summary metrics are calculated over all tasks rather than only the successful ones.
      After https://issues.apache.org/jira/browse/SPARK-26119, the in-memory store computes task summary metrics from the metrics of successful tasks only. We still need an efficient implementation for the disk store case in the SHS (Spark History Server). The main bottleneck for the disk store is deserialization overhead.
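
      As an illustration of the intended behaviour (summary metrics over successful tasks only), here is a minimal Python sketch. It is not Spark code: the `Task` record, the status strings, and `summary_quantiles` are hypothetical names chosen for this example, and the quantile rule (index = floor(q * n), clamped to the last element) is an assumption.

      ```python
      from dataclasses import dataclass


      @dataclass
      class Task:
          # Hypothetical task record; the field names are illustrative only.
          task_id: int
          status: str        # e.g. "SUCCESS", "FAILED", "KILLED"
          run_time_ms: int


      def summary_quantiles(tasks, quantiles=(0.0, 0.25, 0.5, 0.75, 1.0)):
          """Return one metric value per quantile, using successful tasks only."""
          values = sorted(t.run_time_ms for t in tasks if t.status == "SUCCESS")
          if not values:
              return []
          last = len(values) - 1
          # Assumed rule: pick the element at floor(q * n), clamped to the end.
          return [values[min(int(q * len(values)), last)] for q in quantiles]


      tasks = [
          Task(1, "SUCCESS", 10),
          Task(2, "FAILED", 1000),
          Task(3, "SUCCESS", 20),
          Task(4, "SUCCESS", 30),
          Task(5, "KILLED", 5),
      ]
      print(summary_quantiles(tasks))
      ```

      Filtering like this requires reading every task record, which is cheap in memory but is exactly the deserialization cost that makes the disk store case slow.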

      Hints: The way indexing works would need to be reworked so that specific metrics can be indexed separately for successful and failed tasks (this would be tricky). It would also require changing the disk store version (to invalidate old stores).

      Alternatively, any other efficient solution would work.
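
      One possible reading of the indexing hint, sketched in Python under stated assumptions: if the index key for a metric carries a success flag ahead of the metric value, all successful entries form one contiguous sorted range, and quantiles can be read from index positions without deserializing the task records. `MetricIndex` and its methods are hypothetical and do not correspond to Spark's KVStore API or to the change that resolved this issue.

      ```python
      import bisect


      class MetricIndex:
          """Hypothetical sorted index over one task metric.

          Keys are (failed_flag, value, task_id), so successful tasks
          (failed_flag == 0) sort before all non-successful ones.
          """

          def __init__(self):
              self._keys = []

          def add(self, task_id, succeeded, value):
              key = (0 if succeeded else 1, value, task_id)
              bisect.insort(self._keys, key)  # keep the key list sorted

          def successful_quantiles(self, quantiles):
              # (1,) sorts after every (0, ...) key and before every (1, ...) key,
              # so this is the position of the first non-successful entry.
              end = bisect.bisect_left(self._keys, (1,))
              if end == 0:
                  return []
              # Read values straight from index keys; no task record is loaded.
              return [self._keys[min(int(q * end), end - 1)][1] for q in quantiles]


      index = MetricIndex()
      index.add(task_id=1, succeeded=True, value=10)
      index.add(task_id=2, succeeded=False, value=1000)
      index.add(task_id=3, succeeded=True, value=20)
      index.add(task_id=4, succeeded=True, value=30)
      print(index.successful_quantiles([0.0, 0.5, 1.0]))
      ```

      Because the key layout differs from a value-only index, stores written with the old layout would be unreadable under this scheme, which is consistent with the hint about changing the disk store version.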

            People

              Assignee: shahid
              Reporter: shahid
              Votes: 0
              Watchers: 2