Details
Type: Wish
Status: Resolved
Priority: Minor
Resolution: Incomplete
2.2.0
None
All
Description
It would be great to have the ability to read the results of a streaming query as a non-streaming data source, i.e. without reading _spark_metadata. In some use cases the output is modified by external tools (for example, combining small Parquet/ORC files with Hadoop rather than Spark), which leaves _spark_metadata outdated. This in turn can cause errors if the metadata refers to files that have been deleted or moved.
Currently there is no way to override this behavior.
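To illustrate the failure mode, here is a minimal sketch in plain Python. It does not use Spark and does not reproduce the real log format. It assumes a toy log in which each line is a JSON object with a single "path" field, and the helper names (list_via_log, list_via_directory, stale_entries, demo) are hypothetical. It only shows how a file listing taken from the log can diverge from the directory contents after an external compaction.

```python
import json
import os
import tempfile

LOG_DIR = "_spark_metadata"  # directory name taken from the issue text


def list_via_log(out_dir):
    """Return the file names recorded in the toy metadata log."""
    log_dir = os.path.join(out_dir, LOG_DIR)
    names = []
    for batch in sorted(os.listdir(log_dir)):
        with open(os.path.join(log_dir, batch)) as fh:
            for line in fh:
                # Assumed toy format: one JSON object per line with a "path" key.
                names.append(json.loads(line)["path"])
    return names


def list_via_directory(out_dir):
    """Return the data files actually present, ignoring the log directory."""
    return sorted(
        name
        for name in os.listdir(out_dir)
        if not name.startswith("_")
        and os.path.isfile(os.path.join(out_dir, name))
    )


def stale_entries(out_dir):
    """Return log entries whose files no longer exist on disk."""
    return [
        name
        for name in list_via_log(out_dir)
        if not os.path.exists(os.path.join(out_dir, name))
    ]


def demo():
    with tempfile.TemporaryDirectory() as out_dir:
        os.mkdir(os.path.join(out_dir, LOG_DIR))

        # The "streaming query" writes three small files and logs them in batch 0.
        parts = ["part-0.parquet", "part-1.parquet", "part-2.parquet"]
        for name in parts:
            with open(os.path.join(out_dir, name), "w") as fh:
                fh.write("rows")
        with open(os.path.join(out_dir, LOG_DIR, "0"), "w") as fh:
            for name in parts:
                fh.write(json.dumps({"path": name}) + "\n")

        # An external tool compacts the small files without updating the log.
        with open(os.path.join(out_dir, "compacted-0.parquet"), "w") as fh:
            fh.write("rows" * 3)
        for name in parts:
            os.remove(os.path.join(out_dir, name))

        return (
            list_via_log(out_dir),
            list_via_directory(out_dir),
            stale_entries(out_dir),
        )
```

In this sketch, demo() returns a log-based listing in which every entry is stale, while the directory-based listing contains only the compacted file. A reader that trusts the log would try to open files that no longer exist, which is the situation this request asks to be able to bypass.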