  1. Spark
  2. SPARK-43343

Spark Streaming is not able to read a .txt file whose name contains the special characters []

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.5.0
    • Component/s: Structured Streaming
    • Labels: None

    Description

      • For example, suppose a directory contains the following file:
        /path/abc[123]
        and a user loads the directory as stream input with spark.readStream.format("text").load("/path"). The query throws an exception saying that no path matches /path/abc[123]. Spark treats abc[123] as a glob pattern, which matches only files named abc1, abc2, and abc3.
      • Upon investigation, this is due to how getBatch works in `FileStreamSource`. `FileStreamSource` already checks the file pattern and finds all matching file names. However, `DataSource` then checks for glob characters again and tries to expand them.
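
      The double expansion described above can be illustrated with a minimal sketch. It uses Python's standard fnmatch and glob modules as a stand-in for Hadoop-style glob matching; it is not Spark's code, the file list is hypothetical, and the escaping shown is only one possible way to make the second pass literal (the actual fix may differ).

```python
import fnmatch
import glob

# Hypothetical directory contents under /path, for illustration only.
files = ["abc[123]", "abc1", "abc2", "abc3", "abc4"]


def expand(pattern, names):
    """Return the names that match `pattern` under glob rules."""
    return [n for n in names if fnmatch.fnmatchcase(n, pattern)]


# First pass: the directory listing already resolved the literal file name.
listed = "abc[123]"

# Second pass: the literal name is treated as a glob pattern again.
# "[123]" is read as a character class, so the file itself does not match.
reexpanded = expand(listed, files)

# Escaping the glob metacharacters makes the second pass match literally.
escaped = glob.escape(listed)
literal = expand(escaped, files)

print(reexpanded)
print(escaped)
print(literal)
```

      In this sketch, the second pass over the unescaped name selects abc1, abc2, and abc3 but not abc[123] itself, which mirrors the "no matching path" symptom when only abc[123] exists.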

          People

            Assignee: Siying Dong
            Reporter: Siying Dong