Spark / SPARK-29953

File stream source cleanup options may break a file sink output


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: Structured Streaming
    • Labels: None

    Description

      SPARK-20568 added options to the file streaming source to clean up processed files. However, when these options are applied to a directory that was written by a file streaming sink, the directory can no longer be queried: the source deletes files from the directory, but the file sink's metadata log still tracks them.
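
The failure mode can be illustrated with a toy model. The sketch below is not Spark code: the `manifest.json` file and the helpers `write_sink_output` and `read_via_manifest` are hypothetical stand-ins for the sink's metadata log under `_spark_metadata` and for a reader that trusts that log instead of listing the directory.

```python
import json
import tempfile
from pathlib import Path


def write_sink_output(out_dir: Path, files: dict) -> None:
    """Toy sink: write data files plus a manifest listing them (invented format)."""
    meta_dir = out_dir / "_spark_metadata"
    meta_dir.mkdir(parents=True)  # also creates out_dir
    for name, text in files.items():
        (out_dir / name).write_text(text)
    (meta_dir / "manifest.json").write_text(json.dumps(sorted(files)))


def read_via_manifest(out_dir: Path) -> list:
    """Toy reader: trust the manifest rather than listing the directory."""
    manifest = out_dir / "_spark_metadata" / "manifest.json"
    names = json.loads(manifest.read_text())
    return [(out_dir / name).read_text() for name in names]


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        out_dir = Path(tmp) / "sink_output"
        write_sink_output(out_dir, {"part-0": "a", "part-1": "b"})
        assert read_via_manifest(out_dir) == ["a", "b"]

        # A downstream source "cleans up" a processed file in place.
        (out_dir / "part-0").unlink()

        try:
            read_via_manifest(out_dir)
        except FileNotFoundError:
            # The manifest still lists part-0, so the read fails.
            return "broken"
        return "ok"
```

In this model the manifest is never updated by the cleanup step, which mirrors the mismatch described above between deleted files and the sink log.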

      I think we should block the options if the input source is a file streaming sink path (that is, the directory contains a "_spark_metadata" folder).
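
A minimal sketch of the proposed guard, assuming a local filesystem and a plain options dictionary. This is not the actual patch: `validate_cleanup_option` and `is_file_sink_output` are hypothetical names, and `cleanSource` (with `off` as the disabled value) is assumed here to be the cleanup option introduced by SPARK-20568. A real fix would presumably go through the Hadoop FileSystem API and handle glob paths.

```python
from pathlib import Path

METADATA_DIR = "_spark_metadata"


def is_file_sink_output(path: Path) -> bool:
    """Return True if the directory looks like file streaming sink output."""
    return (path / METADATA_DIR).is_dir()


def validate_cleanup_option(source_path: Path, options: dict) -> None:
    """Reject cleanup when the source path is a file sink output directory.

    `options` is an illustrative dict of source options; a missing
    'cleanSource' key is treated as 'off' (an assumption of this sketch).
    """
    clean_source = str(options.get("cleanSource", "off")).lower()
    if clean_source != "off" and is_file_sink_output(source_path):
        raise ValueError(
            f"cleanSource={clean_source} is not allowed on {source_path}: "
            f"the directory contains {METADATA_DIR}, so removing files "
            "would leave the sink log pointing at missing files"
        )
```

Checking once, before the query starts, keeps the rule simple: cleanup stays available for ordinary directories and is refused only where a sink log is present.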

            People

              Assignee: kabhwan Jungtaek Lim
              Reporter: zsxwing Shixiong Zhu
              Votes: 0
              Watchers: 2
