[SPARK-22013] Allow to read the results of a streaming query as non-streaming datasource - ASF JIRA

Details

Type: Wish
Status: Resolved
Priority: Minor
Resolution: Incomplete
Affects Version/s: 2.2.0
Fix Version/s: None
Component/s: SQL
Labels:
- bulk-closed
Environment: All

Description

It would be useful to be able to read the output of a streaming query as a non-streaming data source, i.e. to skip reading _spark_metadata. In some use cases the output is modified by external tools (for example, small Parquet/ORC files are combined with Hadoop rather than Spark), which leaves _spark_metadata outdated. This in turn can cause errors if the metadata refers to files that have been deleted or moved.

Currently there is no way to override this behavior.
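
The failure mode described above can be illustrated without Spark. The following is a minimal sketch, not Spark's actual implementation: the log layout is a simplified stand-in for _spark_metadata, and the helper names (`write_metadata_log`, `files_from_metadata`, `files_from_listing`, `demo`) are hypothetical. It only shows why a reader that trusts the log breaks after an external compaction, while a plain directory listing still sees the correct files.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def write_metadata_log(out_dir: Path, data_files: List[Path]) -> None:
    """Write a simplified stand-in for a sink metadata log (assumed layout)."""
    log_dir = out_dir / "_spark_metadata"
    log_dir.mkdir(parents=True, exist_ok=True)
    # One version header line, then one JSON entry per committed file.
    lines = ["v1"] + [json.dumps({"path": str(f), "action": "add"}) for f in data_files]
    (log_dir / "0").write_text("\n".join(lines))


def files_from_metadata(out_dir: Path) -> List[Path]:
    """List data files as a metadata-driven reader would: trust the log only."""
    listed = []
    for batch in sorted((out_dir / "_spark_metadata").iterdir()):
        for line in batch.read_text().splitlines()[1:]:  # skip the version header
            listed.append(Path(json.loads(line)["path"]))
    return listed


def files_from_listing(out_dir: Path) -> List[Path]:
    """List data files as a plain batch reader would: scan the directory."""
    return sorted(
        p for p in out_dir.iterdir() if p.is_file() and not p.name.startswith("_")
    )


def demo() -> Dict[str, List[str]]:
    with tempfile.TemporaryDirectory() as tmp:
        out_dir = Path(tmp)
        # The "streaming query" writes three small files and logs them.
        small = [out_dir / f"part-0000{i}.parquet" for i in range(3)]
        for f in small:
            f.write_text("rows")
        write_metadata_log(out_dir, small)

        # An external tool merges the small files and deletes the originals,
        # without touching the metadata log.
        merged = out_dir / "compacted-00000.parquet"
        merged.write_text("".join(f.read_text() for f in small))
        for f in small:
            f.unlink()

        # Entries the log still points at, although the files are gone.
        stale = [p.name for p in files_from_metadata(out_dir) if not p.exists()]
        actual = [p.name for p in files_from_listing(out_dir)]
        return {"stale": stale, "actual": actual}


if __name__ == "__main__":
    print(demo())
```

In this sketch the metadata-driven listing returns only paths that no longer exist, which corresponds to the errors reported here, while the directory listing returns the compacted file. The request in this issue amounts to an option for choosing the second behavior.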

People

Assignee: Unassigned

Reporter: Ivan Sharamet

Votes: 1

Watchers: 2

Dates

Created: 14/Sep/17 12:51

Updated: 21/May/19 04:11

Resolved: 21/May/19 04:11