  1. Spark
  2. SPARK-33068

Spark 2.3 vs Spark 1.6 collect_list giving different schema


    Details

    • Type: IT Help
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 2.3.4
    • Fix Version/s: None
    • Component/s: Spark Submit
    • Labels:
      None

      Description

      Hi,

      I am migrating from Spark 1.6 to Spark 2.3, but collect_list now produces a different schema.

       

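      import org.apache.spark.sql.functions.{array, collect_list, sum}
      // Assumes `df` is in scope and spark.implicits._ (sqlContext.implicits._ on 1.6) is imported for $"..."

      val df_date_agg = df
          .groupBy($"a", $"b", $"c")
          // First pass: sum d and e per (a, b, c)
          .agg(sum($"d").alias("data1"), sum($"e").alias("data2"))
          .groupBy($"a")
          // Second pass: collect one [b, c, total] entry per value of a
          .agg(collect_list(array($"b", $"c", $"data1")).alias("final_data1"),
               collect_list(array($"b", $"c", $"data2")).alias("final_data2"))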
      

      When I run the code above in Spark 1.6, I get the following schema:

       

       

       |-- final_data1: array (nullable = true)
       |    |-- element: string (containsNull = true)
       |-- final_data2: array (nullable = true)
       |    |-- element: string (containsNull = true)
      

       

       

      But in Spark 2.3 the schema changes to:

       

       |-- final_data1: array (nullable = true)
       |    |-- element: array (containsNull = true)
       |    |    |-- element: string (containsNull = true)
       |-- final_data2: array (nullable = true)
       |    |-- element: array (containsNull = true)
       |    |    |-- element: string (containsNull = true)
      

       

       

      In Spark 1.6, array($"b",$"c",$"data1") is converted to a string like this:

      '[2020-09-26, Ayush, 103.67]'
      

      In Spark 2.3 it is converted to a WrappedArray instead:

      WrappedArray(2020-09-26, Ayush, 103.67)
      

      I want to keep the schema as it was in Spark 1.6; otherwise all the dependent code has to change.
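
      One possible way to get back an array of strings on Spark 2.3 is to build the bracketed string explicitly before calling collect_list, rather than relying on how array(...) is rendered. The sketch below is an assumption, not a confirmed fix: the object name CollectListAsString and the helper bracketed are hypothetical, the column names are taken from the snippet above, and the numeric formatting produced by cast("string") may not match the Spark 1.6 output exactly. Note also that concat_ws skips null values instead of printing them.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.{col, collect_list, concat, concat_ws, lit, sum}

object CollectListAsString {
  // Hypothetical helper: renders the given columns as one string "[v1, v2, v3]".
  // Each column is cast to string first; concat_ws drops nulls rather than printing them.
  def bracketed(cols: Column*): Column =
    concat(lit("["), concat_ws(", ", cols.map(_.cast("string")): _*), lit("]"))

  // Same two-step aggregation as in the report, but collecting strings instead of arrays.
  def aggregate(df: DataFrame): DataFrame =
    df.groupBy(col("a"), col("b"), col("c"))
      .agg(sum(col("d")).alias("data1"), sum(col("e")).alias("data2"))
      .groupBy(col("a"))
      .agg(
        collect_list(bracketed(col("b"), col("c"), col("data1"))).alias("final_data1"),
        collect_list(bracketed(col("b"), col("c"), col("data2"))).alias("final_data2"))
}
```

      With this, val df_date_agg = CollectListAsString.aggregate(df) should yield final_data1 and final_data2 as arrays of strings, so downstream code that expects the old shape may not need to change. It is worth comparing a few rows against the Spark 1.6 output before relying on it.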

       

      Thanks

       

            People

            • Assignee:
              Unassigned
              Reporter:
              ayush_goyal Ayush Goyal