  1. Spark
  2. SPARK-27716

Complete transaction support for part of the JDBC datasource operations.


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 2.4.3
    • Fix Version/s: None
    • Component/s: SQL

    Description

      With the JDBC datasource, we can save an RDD to the database.

      The comment on the saveTable function says:

        /**
         * Saves the RDD to the database in a single transaction.
         */
        def saveTable(
            df: DataFrame,
            tableSchema: Option[StructType],
            isCaseSensitive: Boolean,
            options: JdbcOptionsInWrite)
      

      In fact, this is not true.

      Each savePartition operation runs in a single transaction, but the saveTable operation as a whole does not.

      There are several cases when writing data:

      case1: Append data to an existing gptable.
      case2: Overwrite an existing gptable that is a cascadingTruncateTable, so we cannot drop it; we have to truncate it and then append the data.
      case3: Overwrite an existing table that is not a cascadingTruncateTable, so we can drop it first.
      case4: The table does not exist, so we create it and then write the data.

      In this PR, I add transaction support for case3 and case4.
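
The four cases can be sketched as a small pure function. This is only an illustration of the description in this issue, not Spark's actual code: the names WriteCase, classify, and coveredByProposal are hypothetical, and the three boolean inputs are assumed to be known before the write starts.

```scala
// Hypothetical names for illustration; none of these exist in Spark.
sealed trait WriteCase
case object AppendToExisting extends WriteCase   // case1
case object TruncateThenAppend extends WriteCase // case2
case object DropThenCreate extends WriteCase     // case3
case object CreateNew extends WriteCase          // case4

object WriteCases {
  // Maps the situation described in this issue to one of the four cases.
  def classify(
      tableExists: Boolean,
      overwrite: Boolean,
      cascadingTruncate: Boolean): WriteCase =
    if (!tableExists) CreateNew
    else if (!overwrite) AppendToExisting
    else if (cascadingTruncate) TruncateThenAppend
    else DropThenCreate

  // Only case3 and case4 are covered by the proposed change.
  def coveredByProposal(c: WriteCase): Boolean = c match {
    case DropThenCreate | CreateNew => true
    case _                          => false
  }
}
```

In case3 and case4 the original table is either absent or about to be dropped anyway, which may be why the staging approach is limited to them.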

      For case3 and case4, we can first write the RDD to a temp table.

      We use an accumulator to record the number of successful savePartition operations.

      Finally, we compare the accumulator's value with the DataFrame's number of partitions.

      If all the savePartition operations succeeded, we drop the original table if it exists, then rename the temp table to the original table's name.
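
The flow described above can be sketched without Spark or a real database. In this minimal simulation, a mutable map stands in for the database, an AtomicLong plays the role of the accumulator, and the names StagedWrite, stagedOverwrite, failOn, and the "_tmp" suffix are all hypothetical. Dropping the temp table when a partition fails is an assumption; the issue does not say what happens in that case.

```scala
import java.util.concurrent.atomic.AtomicLong
import scala.collection.mutable
import scala.util.Try

object StagedWrite {
  // In-memory stand-in for a database: table name -> rows.
  type Db = mutable.Map[String, Vector[Int]]

  // Writes every partition to a temp table, then swaps it in only if
  // all partitions succeeded. Returns true when the swap happened.
  // failOn lists partition indexes that should fail (for demonstration).
  def stagedOverwrite(
      db: Db,
      target: String,
      partitions: Seq[Seq[Int]],
      failOn: Set[Int] = Set.empty): Boolean = {
    val temp = target + "_tmp" // hypothetical temp table name
    db(temp) = Vector.empty
    val succeeded = new AtomicLong(0) // plays the role of the accumulator

    partitions.zipWithIndex.foreach { case (rows, idx) =>
      // Each partition write succeeds or fails on its own.
      val result = Try {
        if (failOn(idx)) throw new RuntimeException(s"partition $idx failed")
        db(temp) = db(temp) ++ rows
      }
      if (result.isSuccess) succeeded.incrementAndGet()
    }

    if (succeeded.get == partitions.size) {
      db.remove(target)                // drop the original table if it exists
      db(target) = db.remove(temp).get // rename temp table to the original name
      true
    } else {
      db.remove(temp) // assumption: discard the temp table on failure
      false
    }
  }
}
```

With this shape, a failed partition leaves the original table untouched, because the drop and rename only run after the success count matches the number of partitions.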

            People

              Assignee: Unassigned
              Reporter: feiwang (hzfeiwang)