SPARK-32906

Struct field names should not change after normalizing floats


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.0.2, 3.1.0
    • Fix Version/s: 3.0.2, 3.1.0
    • Component/s: SQL
    • Labels:
      None

      Description

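      This ticket fixes a minor bug in float normalization for struct types: the normalized struct is rebuilt with generated field names, so the field name changes from `_1` (final aggregate) to `col1` (partial aggregate), as the output below shows: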

      scala> import org.apache.spark.sql.execution.aggregate.HashAggregateExec
      scala> val df = Seq(Tuple1(Tuple1(-0.0d)), Tuple1(Tuple1(0.0d))).toDF("k")
      scala> val agg = df.distinct()
      scala> agg.explain()
      == Physical Plan ==
      *(2) HashAggregate(keys=[k#40], functions=[])
      +- Exchange hashpartitioning(k#40, 200), true, [id=#62]
         +- *(1) HashAggregate(keys=[knownfloatingpointnormalized(if (isnull(k#40)) null else named_struct(col1, knownfloatingpointnormalized(normalizenanandzero(k#40._1)))) AS k#40], functions=[])
            +- *(1) LocalTableScan [k#40]
      
      scala> val aggOutput = agg.queryExecution.sparkPlan.collect { case a: HashAggregateExec => a.output.head }
      scala> aggOutput.foreach { attr => println(attr.prettyJson) }
      ### Final Aggregate ###
      [ {
        "class" : "org.apache.spark.sql.catalyst.expressions.AttributeReference",
        "num-children" : 0,
        "name" : "k",
        "dataType" : {
          "type" : "struct",
          "fields" : [ {
            "name" : "_1",
                      ^^^
            "type" : "double",
            "nullable" : false,
            "metadata" : { }
          } ]
        },
        "nullable" : true,
        "metadata" : { },
        "exprId" : {
          "product-class" : "org.apache.spark.sql.catalyst.expressions.ExprId",
          "id" : 40,
          "jvmId" : "a824e83f-933e-4b85-a1ff-577b5a0e2366"
        },
        "qualifier" : [ ]
      } ]
      
      ### Partial Aggregate ###
      [ {
        "class" : "org.apache.spark.sql.catalyst.expressions.AttributeReference",
        "num-children" : 0,
        "name" : "k",
        "dataType" : {
          "type" : "struct",
          "fields" : [ {
            "name" : "col1",
                      ^^^^
            "type" : "double",
            "nullable" : true,
            "metadata" : { }
          } ]
        },
        "nullable" : true,
        "metadata" : { },
        "exprId" : {
          "product-class" : "org.apache.spark.sql.catalyst.expressions.ExprId",
          "id" : 40,
          "jvmId" : "a824e83f-933e-4b85-a1ff-577b5a0e2366"
        },
        "qualifier" : [ ]
      } ]
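
      The renaming above can be illustrated outside Spark with a toy model. This is only a sketch under stated assumptions: `Field`, `Struct`, and `NormalizeSketch` are hypothetical names that stand in for Catalyst expressions, and the code is not the actual change made for this ticket. It contrasts rebuilding a struct with generated names (`col1`, `col2`, ...) against rebuilding it with the original names while still normalizing `-0.0` and NaN.

```scala
// Toy model, not Spark's Catalyst classes: every name here is hypothetical.
final case class Field(name: String, value: Double)
final case class Struct(fields: Seq[Field])

object NormalizeSketch {
  // Canonicalize -0.0 to 0.0 and every NaN to the single Double.NaN value.
  def normalizeDouble(d: Double): Double =
    if (d.isNaN) Double.NaN
    else if (d == 0.0d) 0.0d // -0.0 == 0.0 is true, so both zeros map to +0.0
    else d

  // Name-dropping variant: rebuilds the struct with generated names
  // (col1, col2, ...), like the named_struct(col1, ...) in the plan above.
  def normalizeDroppingNames(s: Struct): Struct =
    Struct(s.fields.zipWithIndex.map { case (f, i) =>
      Field(s"col${i + 1}", normalizeDouble(f.value))
    })

  // Name-preserving variant: normalizes values but reuses each original name.
  def normalizeKeepingNames(s: Struct): Struct =
    Struct(s.fields.map(f => f.copy(value = normalizeDouble(f.value))))
}
```

      Applied to `Struct(Seq(Field("_1", -0.0d)))`, the name-preserving variant keeps the field name `_1`, whereas the name-dropping variant yields `col1`; both normalize the value to `0.0`.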
      

            People

            • Assignee: Takeshi Yamamuro (maropu)
            • Reporter: Takeshi Yamamuro (maropu)
            • Votes: 0
            • Watchers: 3
