Description
We currently rely on each FileFormat implementation overriding toString to get readable explain output. It would be better to use shortName for this instead.
Before:
scala> spark.read.text("test.text").explain()
== Physical Plan ==
*FileScan text [value#15] Batched: false, Format: org.apache.spark.sql.execution.datasources.text.TextFileFormat@xyz, Location: InMemoryFileIndex[file:/scratch/rxin/spark/test.text], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<value:string>
After:
scala> spark.read.text("test.text").explain()
== Physical Plan ==
*FileScan text [value#15] Batched: false, Format: text, Location: InMemoryFileIndex[file:/scratch/rxin/spark/test.text], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<value:string>
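The difference between the two outputs can be illustrated with a minimal, Spark-free sketch. This is not the actual change: NamedFormat, TextLikeFormat, and ExplainSketch are hypothetical stand-ins, and the only assumption is that each format exposes a shortName string.

```scala
// Hypothetical stand-in for a format that exposes a short name.
// This is NOT Spark's FileFormat; it only mirrors the idea in this ticket.
trait NamedFormat {
  def shortName: String
}

// Hypothetical format that does not override toString.
class TextLikeFormat extends NamedFormat {
  override def shortName: String = "text"
}

object ExplainSketch {
  // Builds the "Format: ..." fragment from shortName, so the output does not
  // depend on whether the implementation overrides toString.
  def formatEntry(format: NamedFormat): String =
    s"Format: ${format.shortName}"

  // Builds the same fragment from toString. Without an override this yields
  // the JVM default of class name, '@', and a hash code.
  def formatEntryViaToString(format: NamedFormat): String =
    s"Format: $format"
}
```

With this sketch, ExplainSketch.formatEntry(new TextLikeFormat) gives a stable, readable fragment, while the toString variant depends on every implementation remembering to override toString.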
Issue Links
- duplicates SPARK-17101 Provide consistent format identifiers for TextFileFormat and ParquetFileFormat (Resolved)