SPARK-18393

DataFrame pivot output column names should respect aliases

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: SQL
    • Labels:
      None

      Description

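      For example:

      import org.apache.spark.sql.functions.expr
      import spark.implicits._  // enables the 'x symbol-to-Column syntax

      val df = spark.range(100).selectExpr("id % 5 as x", "id % 2 as a", "id as b")
      df
        .groupBy('x)
        .pivot("a", Seq(0, 1))
        .agg(expr("sum(b)").as("blah"), expr("count(b)").as("foo"))
        .show()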
      +---+--------------------+---------------------+--------------------+---------------------+
      |  x|0_sum(`b`) AS `blah`|0_count(`b`) AS `foo`|1_sum(`b`) AS `blah`|1_count(`b`) AS `foo`|
      +---+--------------------+---------------------+--------------------+---------------------+
      |  0|                 450|                   10|                 500|                   10|
      |  1|                 510|                   10|                 460|                   10|
      |  3|                 530|                   10|                 480|                   10|
      |  2|                 470|                   10|                 520|                   10|
      |  4|                 490|                   10|                 540|                   10|
      +---+--------------------+---------------------+--------------------+---------------------+
      

      The generated column names embed the full aliased expression (for example, 0_sum(`b`) AS `blah`), which makes them hard to read. Ideally pivot would respect the aliases and generate column names like 0_blah, 0_foo, 1_blah, 1_foo instead.
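
      Until the aliases are respected, one possible workaround is to rename the pivoted columns after the fact. The sketch below is not part of Spark: PivotNames and shorten are hypothetical names, and the regex assumes the column names have exactly the shape shown in the output above (pivot value, underscore, expression, AS, backquoted alias) and that the pivot value itself contains no underscore.

```scala
// Hypothetical helper (not a Spark API): shortens a pivot output column name
// of the form  <pivotValue>_<expression> AS `<alias>`  to  <pivotValue>_<alias>.
// Names that do not match this shape are returned unchanged.
object PivotNames {
  // Group 1: the pivot value (assumed to contain no underscore).
  // Group 2: the alias between the trailing backquotes.
  private val Aliased = """^([^_]+)_.* AS `([^`]+)`$""".r

  def shorten(name: String): String = name match {
    case Aliased(pivotValue, alias) => s"${pivotValue}_$alias"
    case _                          => name // e.g. the grouping column "x"
  }
}
```

      Given the pivoted DataFrame from the example (call it pivoted), the renaming could then be applied with pivoted.toDF(pivoted.columns.map(PivotNames.shorten): _*). Because the helper only rewrites strings, it leaves the grouping column and any already-clean names alone.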

              People

              • Assignee: Unassigned
              • Reporter: Eric Liang (ekhliang)
              • Votes: 0
              • Watchers: 6
