Spark / SPARK-27589 Spark file source V2 / SPARK-27132

Improve file source V2 framework

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: SQL
    • Labels: None

    Description

      While migrating the CSV source to V2, I found that the file source V2 framework can be improved by:
      1. Checking for duplicated column names on both read and write.
      2. Removing `SupportsPushDownFilters` from `FileScanBuilder`, since not all file sources support filter push down.
      3. Adding a new member `options` to `FileScan`, since the method `isSplitable` might require data source options.
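
The three changes above can be sketched outside of Spark. This is a minimal Python model, not the actual patch: `check_duplicate_columns`, `CsvScan`, `CsvScanBuilder` and `OrcLikeScanBuilder` are hypothetical names, filters are modelled as plain strings, and using the `multiLine` option to decide splittability is an assumption chosen to show why `isSplitable` might need the options.

```python
from dataclasses import dataclass, field
from typing import Dict, List


def check_duplicate_columns(names: List[str], case_sensitive: bool = False) -> None:
    """Raise ValueError if a column name occurs more than once (item 1)."""
    seen = set()
    duplicates = []
    for name in names:
        key = name if case_sensitive else name.lower()
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    if duplicates:
        raise ValueError("Found duplicate column(s): " + ", ".join(duplicates))


class SupportsPushDownFilters:
    """Opt-in mixin (item 2): only builders that can push filters down use it."""

    def push_filters(self, filters: List[str]) -> List[str]:
        # Record the filters and return the ones that still need evaluating
        # after the scan. This sketch pretends every filter can be pushed.
        self.pushed_filters = list(filters)
        return []


class FileScanBuilder:
    """Base builder: no filter push down unless a subclass adds the mixin."""


class CsvScanBuilder(FileScanBuilder):
    pass


class OrcLikeScanBuilder(FileScanBuilder, SupportsPushDownFilters):
    pass


@dataclass
class FileScan:
    schema: List[str]
    # Item 3: the scan carries the data source options.
    options: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # The same check would also run on the write path.
        check_duplicate_columns(self.schema)

    def is_splitable(self, path: str) -> bool:
        return False


class CsvScan(FileScan):
    def is_splitable(self, path: str) -> bool:
        # Assumption: in multi-line mode a record can span several lines,
        # so the file must be read as a whole.
        return self.options.get("multiLine", "false").lower() != "true"
```

With this shape, a source that cannot push filters down simply does not inherit the mixin, and a scan can consult `options` without any change to the `is_splitable` signature.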

            People

              Assignee: Gengliang Wang
              Reporter: Gengliang Wang