  1. Spark
  2. SPARK-20773

ParquetWriteSupport.writeFields is quadratic in number of fields

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.1.1
    • Fix Version/s: 2.1.2, 2.2.0
    • Component/s: SQL

    Description

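      The writeFields method in ParquetWriteSupport looks up each field's writer by index using Seq.apply. Because fieldWriters is a List, each indexed lookup has to walk the list from its head, so the cost of the i-th lookup grows linearly with i. Writing a single row with n fields therefore takes O(n^2) time instead of O(n).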

      See line 123: https://github.com/apache/spark/blob/ac1ab6b9db188ac54c745558d57dd0a031d0b162/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetWriteSupport.scala
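
To illustrate the complexity argument, here is a minimal, self-contained sketch that does not use any Spark code. The object FieldWriterIndexing, the Writer type alias, and both method names are hypothetical stand-ins chosen for this example; only the access pattern (an index-driven loop calling apply on a List) is taken from the description above. Switching to an Array is shown as one possible way to get constant-time indexing, not necessarily as the change that resolved this issue.

```scala
object FieldWriterIndexing {
  // Hypothetical stand-in for a per-field writer; not Spark's actual writer type.
  type Writer = Int => Int

  // Index-driven loop over a List. List.apply(i) walks i nodes from the head,
  // so the whole loop performs roughly n^2 / 2 node traversals.
  def sumViaListApply(writers: List[Writer]): Long = {
    var total = 0L
    var i = 0
    val n = writers.length
    while (i < n) {
      total += writers(i)(i)
      i += 1
    }
    total
  }

  // The same loop over an Array. Array.apply(i) is constant time,
  // so the whole loop is linear in the number of writers.
  def sumViaArrayApply(writers: Array[Writer]): Long = {
    var total = 0L
    var i = 0
    val n = writers.length
    while (i < n) {
      total += writers(i)(i)
      i += 1
    }
    total
  }

  // Small timing helper; returns the result and the elapsed milliseconds.
  def timed[A](body: => A): (A, Double) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1e6)
  }

  def main(args: Array[String]): Unit = {
    val n = 5000 // assumed field count, chosen only to make the gap visible
    val asList: List[Writer] = List.tabulate(n)(k => (x: Int) => x + k)
    val asArray: Array[Writer] = asList.toArray

    val (listSum, listMs) = timed(sumViaListApply(asList))
    val (arraySum, arrayMs) = timed(sumViaArrayApply(asArray))

    // Both loops compute the same value; only the running time differs.
    println(s"List  apply: sum=$listSum in $listMs ms")
    println(s"Array apply: sum=$arraySum in $arrayMs ms")
  }
}
```

Both methods return the same result for the same writers. Actual timings depend on the JVM and hardware, so none are quoted here, but the List version should slow down much faster than the Array version as n grows.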

          People

            Assignee: Tim Poterba (tpoterba)
            Reporter: Tim Poterba (tpoterba)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: 10m
                Remaining: 10m
                Logged: Not Specified