Spark / SPARK-19715

Option to Strip Paths in FileSource

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.2.0
    • Component/s: Structured Streaming
    • Labels: None

    Description

      Today, we compare the whole path when deciding whether a file is new in the FileSource for structured streaming. However, this can cause false negatives when the path has changed only in a cosmetic way (e.g., changing s3n to s3a). We should add an option, fileNameOnly, that bases the new-file check on the filename alone (while still storing the whole path in the log).
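
      The proposed check can be sketched in plain Python, without Spark. This is only an illustration of the idea, not the FileSource implementation: the SeenFiles class, its add_if_new method, the file_name_only flag, and the bucket paths are all hypothetical names chosen for the example.

```python
import posixpath
from urllib.parse import urlparse


class SeenFiles:
    """Hypothetical stand-in for the file source's record of processed files."""

    def __init__(self, file_name_only=False):
        self.file_name_only = file_name_only
        self._keys = set()  # keys used for the "is this file new?" check
        self.log = []       # full paths are still recorded, as the issue asks

    def _key(self, path):
        # With file_name_only, compare only the last path component;
        # otherwise compare the whole path string, scheme included.
        if self.file_name_only:
            return posixpath.basename(urlparse(path).path)
        return path

    def add_if_new(self, path):
        """Record the path and return True if it counts as a new file."""
        key = self._key(path)
        if key in self._keys:
            return False
        self._keys.add(key)
        self.log.append(path)
        return True


if __name__ == "__main__":
    old_path = "s3n://bucket/data/part-0001.json"
    new_path = "s3a://bucket/data/part-0001.json"

    whole_path = SeenFiles()
    print(whole_path.add_if_new(old_path), whole_path.add_if_new(new_path))

    by_name = SeenFiles(file_name_only=True)
    print(by_name.add_if_new(old_path), by_name.add_if_new(new_path))
```

      The trade-off of comparing names only is that two different files with the same name in different directories are treated as the same file, so the option suits sources whose file names are unique.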

          People

            Assignee: Liwei Lin (lwlin)
            Reporter: Michael Armbrust (marmbrus)