SPARK-16033 (sub-task of SPARK-16032: Audit semantics of various insertion operations related to partitioned tables)

DataFrameWriter.partitionBy() can't be used together with DataFrameWriter.insertInto()

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.0.0
    • Component/s: SQL
    • Labels: None

    Description

      When inserting into an existing partitioned table, the partitioning columns should always be determined by the catalog metadata of the table being inserted into. Extra partitionBy() calls make no sense in this case and can corrupt existing data, because newly inserted data may be written with the wrong partition directory layout.
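
The rule described above can be illustrated with a small toy model. This is a sketch in plain Python, not Spark code: `partition_path`, `resolve_insert_partitioning`, and the `year`/`month` table layout are hypothetical names invented for illustration, and the error raised here is not Spark's actual exception or message. It shows that the catalog's partition columns decide the directory layout, and that a conflicting writer-side specification would have produced a different layout.

```python
# Toy model only: none of these names are Spark APIs.

def partition_path(row, partition_cols):
    """Build a Hive-style partition directory such as 'year=2016/month=6'."""
    return "/".join(f"{col}={row[col]}" for col in partition_cols)


def resolve_insert_partitioning(catalog_partition_cols, writer_partition_cols=None):
    """Return the partition columns to use for an insert into an existing table.

    The catalog metadata is the single source of truth, so a writer-side
    partitionBy()-style specification is rejected instead of being honored.
    """
    if writer_partition_cols is not None:
        raise ValueError(
            "partition columns are already defined by the table; "
            "do not specify them again when inserting"
        )
    return list(catalog_partition_cols)


catalog_cols = ["year", "month"]  # hypothetical layout of the existing table
row = {"year": 2016, "month": 6, "value": 42}

# Layout the existing table expects.
expected = partition_path(row, resolve_insert_partitioning(catalog_cols))

# Layout a conflicting writer-side specification would have produced.
conflicting = partition_path(row, ["month"])

print(expected)     # year=2016/month=6
print(conflicting)  # month=6
```

Rejecting the extra specification outright, rather than silently ignoring it, is one possible design choice: it surfaces the mistake to the caller instead of hiding it.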

          People

            Assignee: Cheng Lian
            Reporter: Cheng Lian
            Votes: 0
            Watchers: 4
