  Spark / SPARK-11361

Show scopes of RDD operations inside DStream.foreachRDD and DStream.transform in DAG viz

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: DStreams
    • Labels: None

      Description

      Currently, when a DStream sets the scope for the RDDs it generates, RDD operations are not allowed to override that scope. As a result, every RDD created inside `DStream.foreachRDD` gets the same scope, `foreachRDD @ <time>`, as set by the `ForeachDStream`. This makes the generated RDDs hard to tell apart, and therefore hard to debug, in the RDD DAG viz in the Spark UI.

      This JIRA proposes allowing the RDD operations inside `DStream.transform` and `DStream.foreachRDD` to append their own scopes to the scope already set by the DStream.
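
      To make the affected code paths concrete, here is a minimal sketch of a streaming job that runs RDD operations inside both `DStream.transform` and `DStream.foreachRDD`. It is not taken from the issue: the object name `ScopeDemo`, the queue-backed input, the sample data, and the timeout are all assumptions chosen for illustration. The `flatMap`, `filter`, `map`, and `reduceByKey` calls are the kind of inner RDD operations whose scopes this issue is about.

```scala
import scala.collection.mutable

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical demo object; names and data are illustrative only.
object ScopeDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("ScopeDemo")
    val ssc = new StreamingContext(conf, Seconds(1))

    // A queue-backed input stream keeps the sketch free of external sources.
    val queue = mutable.Queue[RDD[String]](
      ssc.sparkContext.parallelize(Seq("a b", "b c", "c c")))
    val lines = ssc.queueStream(queue)

    // RDD operations inside DStream.transform.
    val words = lines.transform { rdd =>
      rdd.flatMap(_.split(" ")).filter(_.nonEmpty)
    }

    // RDD operations inside DStream.foreachRDD. Per the description above,
    // these were all shown under the single scope `foreachRDD @ <time>`.
    words.foreachRDD { rdd =>
      val counts = rdd.map(word => (word, 1)).reduceByKey(_ + _)
      counts.collect().foreach(println)
    }

    ssc.start()
    // Assumed timeout; raise it to leave time to inspect the Spark UI.
    ssc.awaitTerminationOrTimeout(5000)
    ssc.stop(stopSparkContext = true, stopGracefully = false)
  }
}
```

      While the job is running, the DAG visualization for each batch's job can be inspected in the Spark UI to see how the operations inside the two closures are grouped.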

            People

            • Assignee: Tathagata Das (tdas)
            • Reporter: Tathagata Das (tdas)
            • Votes: 0
            • Watchers: 2