Spark / SPARK-16213

Reduce runtime overhead of a program that creates a primitive array in DataFrame

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.2.0
    • Component/s: SQL
    • Labels: None

    Description

      When a program creates an array in a DataFrame, the code generator emits boxing operations for the array elements. If the array's element type is primitive, the generated code has room for optimizations that would reduce this runtime overhead.

      Here is a simple example whose generated code contains boxing operations:

      val df = sparkContext.parallelize(Seq(0.0d, 1.0d), 1).toDF
      df.selectExpr("Array(value + 1.1d, value + 2.2d)").show
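
      To illustrate what boxing means here, the following is a minimal plain-Scala sketch, not Spark's actual generated code. The object name `BoxingSketch` and both method names are hypothetical. It contrasts storing doubles in a generic array, where each element is wrapped in a `java.lang.Double`, with storing them in a primitive `Array[Double]`.

```scala
// Illustrative sketch only: plain Scala, NOT the code Spark generates.
// BoxingSketch, boxed and primitive are hypothetical names.
object BoxingSketch {
  // Generic path: elements are stored as objects, so every double
  // is boxed into a java.lang.Double before it is written.
  def boxed(value: Double): Array[Any] = {
    val values = new Array[Any](2)
    values(0) = value + 1.1d // boxing happens on this store
    values(1) = value + 2.2d // and again here
    values
  }

  // Primitive path: elements are written directly into a double[]
  // with no per-element allocation.
  def primitive(value: Double): Array[Double] = {
    val values = new Array[Double](2)
    values(0) = value + 1.1d
    values(1) = value + 2.2d
    values
  }
}
```

      Both methods compute the same values; only the storage differs. For the Spark example above, the generated code can be printed with `debugCodegen()` after `import org.apache.spark.sql.execution.debug._`.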
      

            People

              Assignee: Kazuaki Ishizaki (kiszk)
              Reporter: Kazuaki Ishizaki (kiszk)
