Details
- Type: Improvement
- Status: Open
- Priority: Minor
- Resolution: Unresolved
- Affects Version/s: 3.1.0
- Fix Version/s: None
- Component/s: None
Description
The following is the extended explain output for a streaming query:
== Parsed Logical Plan ==
WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWriter@4737caef
+- Project [value#39 AS value#0]
   +- Streaming RelationV2 socket[value#39] (Options: [host=localhost,port=8888])

== Analyzed Logical Plan ==
WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWriter@4737caef
+- Project [value#39 AS value#0]
   +- Streaming RelationV2 socket[value#39] (Options: [host=localhost,port=8888])

== Optimized Logical Plan ==
WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWriter@4737caef
+- Streaming RelationV2 socket[value#39] (Options: [host=localhost,port=8888])

== Physical Plan ==
WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWriter@4737caef
+- *(1) Project [value#39]
   +- *(1) ScanV2 socket[value#39] (Options: [host=localhost,port=8888])
As you may have noticed, WriteToDataSourceV2 is followed by the default object representation of MicroBatchWriter, which is merely an adapter for a StreamWriter, e.g. ConsoleWriter.
It would be more debugging-friendly if the plans included whatever StreamWriter.toString returns (which in the case of ConsoleWriter would be ConsoleWriter[numRows=..., truncate=...], giving more context).
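The proposed change could be sketched as follows. Note that the classes below are simplified, hypothetical stand-ins used only to illustrate the delegation idea, not Spark's actual MicroBatchWriter or ConsoleWriter implementations:

```java
// Hypothetical, simplified stand-ins to illustrate toString delegation.
interface StreamWriter { }

// A StreamWriter whose toString carries useful debugging context.
class ConsoleWriter implements StreamWriter {
    private final int numRows;
    private final boolean truncate;

    ConsoleWriter(int numRows, boolean truncate) {
        this.numRows = numRows;
        this.truncate = truncate;
    }

    @Override
    public String toString() {
        return "ConsoleWriter[numRows=" + numRows + ", truncate=" + truncate + "]";
    }
}

// The adapter: without an explicit toString it prints as MicroBatchWriter@<hash>;
// delegating to the wrapped writer surfaces its details in the explain output.
class MicroBatchWriter {
    private final StreamWriter writer;

    MicroBatchWriter(StreamWriter writer) {
        this.writer = writer;
    }

    @Override
    public String toString() {
        return writer.toString();
    }
}

public class Demo {
    public static void main(String[] args) {
        MicroBatchWriter w = new MicroBatchWriter(new ConsoleWriter(20, true));
        // Prints the inner writer's description instead of an opaque @hash
        System.out.println(w);
    }
}
```

With such a delegation in place, the plan nodes would show the wrapped writer's description rather than an opaque identity hash.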