  1. Spark
  2. SPARK-26824

Streaming queries may store checkpoint data in a wrong directory


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0, 2.1.0, 2.2.0, 2.3.0, 2.4.0
    • Fix Version/s: 3.0.0
    • Component/s: Structured Streaming
    • Docs Text:
      Earlier versions of Spark incorrectly escaped paths when writing out checkpoints and "_spark_metadata" for Structured Streaming. Queries affected by this issue will fail when run on Spark 3.0, and the error message will include instructions on how to migrate them.

      Description

      When a user specifies a checkpoint location containing special characters that need to be escaped in a path, the streaming query stores its checkpoint data in the wrong place. For example, if you use "/chk chk", the metadata is stored in "/chk%20chk". The file sink's "_spark_metadata" directory has the same issue.
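
The mismatch in the example above can be illustrated with a minimal Python sketch. It does not call Spark or Hadoop. It only assumes that the wrong directory name comes from percent-encoding the location as if it were a URI path, which matches the "/chk chk" to "/chk%20chk" example. The helper names `escaped_form` and `is_affected` are hypothetical, and the exact set of characters Spark escapes may differ from what `urllib` escapes.

```python
from urllib.parse import quote, unquote


def escaped_form(path: str) -> str:
    """Percent-encode characters that are not legal in a URI path.

    Assumption: this approximates the escaping that produced the wrong
    directory name; it is not Spark's actual code path.
    """
    return quote(path, safe="/")


def is_affected(path: str) -> bool:
    """Hypothetical check: True if escaping changes the path at all."""
    return escaped_form(path) != path


user_location = "/chk chk"                       # what the user asked for
stored_location = escaped_form(user_location)    # where the data would land

print(stored_location)                 # /chk%20chk
print(is_affected(user_location))      # True
print(is_affected("/data/chk/query1")) # False

# Decoding the escaped name recovers the intended location.
print(unquote(stored_location) == user_location)  # True
```

A check like this could help spot checkpoint locations worth inspecting before an upgrade, but the migration instructions reported by Spark 3.0 remain the authoritative guide.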

              People

              • Assignee: Shixiong Zhu (zsxwing)
              • Reporter: Shixiong Zhu (zsxwing)