Spark / SPARK-15985

Reduce runtime overhead of a program that reads a primitive array in Dataset

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.0
    • Component/s: SQL
    • Labels: None

    Description

      When a program reads an array in a Dataset, the code generator creates some copy operations. If the array holds a primitive type, the generated code has opportunities for optimization that would reduce this runtime overhead.

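      // Assumes a SparkSession named `spark` is in scope, as in spark-shell.
      import spark.implicits._

      val ds = Seq(Array(1.0, 2.0, 3.0), Array(4.0, 5.0, 6.0)).toDS()
      // Sum the three elements of each primitive Double array.
      ds.map { p =>
        var s = 0.0
        for (i <- 0 to 2) { s += p(i) }
        s
      }.show()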
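
      To give a rough idea of the kind of overhead at issue, the following is a minimal sketch in plain Scala with no Spark dependency. It is not Spark's generated code and not necessarily the change made for this issue. The object and method names (PrimitiveArrayCopySketch, copyBoxed, copyPrimitive, sumBoxed, sumPrimitive) are hypothetical. The sketch contrasts an element-by-element copy through a generic, boxed container with a bulk copy that keeps the data as a primitive array.

```scala
object PrimitiveArrayCopySketch {
  // Generic path (illustrative only): copy each element into an Array[Any].
  // Every Double is boxed into a java.lang.Double on the way in.
  def copyBoxed(src: Array[Double]): Array[Any] = {
    val out = new Array[Any](src.length)
    var i = 0
    while (i < src.length) {
      out(i) = src(i) // boxing happens here
      i += 1
    }
    out
  }

  // Primitive path (illustrative only): one bulk copy, no boxing.
  def copyPrimitive(src: Array[Double]): Array[Double] = {
    val out = new Array[Double](src.length)
    System.arraycopy(src, 0, out, 0, src.length)
    out
  }

  // Reading from the boxed copy needs a cast and an unboxing per element.
  def sumBoxed(a: Array[Any]): Double = {
    var s = 0.0
    var i = 0
    while (i < a.length) {
      s += a(i).asInstanceOf[Double]
      i += 1
    }
    s
  }

  // Reading from the primitive copy is a plain array load per element.
  def sumPrimitive(a: Array[Double]): Double = {
    var s = 0.0
    var i = 0
    while (i < a.length) {
      s += a(i)
      i += 1
    }
    s
  }

  def main(args: Array[String]): Unit = {
    val p = Array(1.0, 2.0, 3.0)
    println(sumBoxed(copyBoxed(p)))         // prints 6.0
    println(sumPrimitive(copyPrimitive(p))) // prints 6.0
  }
}
```

      Both paths compute the same sum. The difference is the per-element boxing, casting, and copying in the generic path, which is the sort of cost a primitive-aware code path can avoid.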
      

          People

            Assignee: kiszk Kazuaki Ishizaki
            Reporter: kiszk Kazuaki Ishizaki