Spark / SPARK-46482

Revert SPARK-43049 due to performance regression of using CLOB

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 3.5.0
    • Fix Version/s: None
    • Component/s: SQL

    Description

      SPARK-43049 causes a performance regression when writing string fields to an Oracle database, because strings are now written as CLOB instead of VARCHAR2. CLOB is known to perform poorly in Oracle. When Spark creates a table and writes to it, the internal SQL statement that writes 20+ string fields takes at least 5x longer than it did before the original patch (10+ min vs. 2 min).

      I confirmed internally that running a job with the commit reverted restores the original performance numbers.
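
For readers hitting the same regression, one possible mitigation (not part of this report, and not verified against the affected job) is to stop relying on the dialect's default string mapping and pass explicit column types through Spark's JDBC `createTableColumnTypes` option, which applies when Spark creates the target table. The sketch below is PySpark-style Python. The helper names `varchar_column_types` and `write_with_varchar`, the default length of 255, and the assumption that column names are simple identifiers are all illustrative.

```python
def varchar_column_types(string_columns, length=255):
    """Build a value for Spark's JDBC `createTableColumnTypes` option.

    Hypothetical helper: assumes column names are simple identifiers that
    need no quoting, and that `length` fits the target database's limits.
    """
    columns = list(string_columns)
    if not columns:
        raise ValueError("at least one column name is required")
    if length < 1:
        raise ValueError("length must be positive")
    # The option uses CREATE TABLE column syntax, e.g. "name VARCHAR(255)".
    return ", ".join(f"{name} VARCHAR({length})" for name in columns)


def write_with_varchar(df, url, table, string_columns, length=255,
                       mode="errorifexists"):
    """Write `df` over JDBC, requesting VARCHAR columns at table creation.

    `df` is expected to be a Spark DataFrame. The option only takes effect
    when Spark itself creates the table.
    """
    (df.write.format("jdbc")
        .option("url", url)
        .option("dbtable", table)
        .option("createTableColumnTypes",
                varchar_column_types(string_columns, length))
        .mode(mode)
        .save())
```

A call such as `write_with_varchar(df, url, "target_table", ["name", "comments"])` would ask for `name VARCHAR(255), comments VARCHAR(255)` when the table is created. Whether this recovers the original write time would need to be measured on the affected workload.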

       

       

       

            People

              Assignee: Unassigned
              Reporter: Ivan Sadikov (ivan.sadikov)
