Spark / SPARK-15918

unionAll returns wrong result when two DataFrames have the same columns in a different order


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 1.6.1
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None
    • Environment: CentOS

    Description

      Applying unionAll to DataFrames A and B, which have the same columns but in a different order, produces a result in which B's values are placed under the wrong column names.

      Repro:

      A.show()
      +---+--------+-------+------+------+-----+----+-------+------+-------+-------+-----+
      |tag|year_day|tm_hour|tm_min|tm_sec|dtype|time|tm_mday|tm_mon|tm_yday|tm_year|value|
      +---+--------+-------+------+------+-----+----+-------+------+-------+-------+-----+
      +---+--------+-------+------+------+-----+----+-------+------+-------+-------+-----+
      
      B.show()
      +-----+-------------------+----------+-------+-------+------+------+------+-------+-------+------+--------+
      |dtype|                tag|      time|tm_hour|tm_mday|tm_min|tm_mon|tm_sec|tm_yday|tm_year| value|year_day|
      +-----+-------------------+----------+-------+-------+------+------+------+-------+-------+------+--------+
      |    F|C_FNHXUT701Z.CNSTLO|1443790800|     13|      2|     0|    10|     0|    275|   2015|1.2345| 2015275|
      |    F|C_FNHXUDP713.CNSTHI|1443790800|     13|      2|     0|    10|     0|    275|   2015|1.2345| 2015275|
      |    F| C_FNHXUT718.CNSTHI|1443790800|     13|      2|     0|    10|     0|    275|   2015|1.2345| 2015275|
      |    F|C_FNHXUT703Z.CNSTLO|1443790800|     13|      2|     0|    10|     0|    275|   2015|1.2345| 2015275|
      |    F|C_FNHXUR716A.CNSTLO|1443790800|     13|      2|     0|    10|     0|    275|   2015|1.2345| 2015275|
      |    F|C_FNHXUT803Z.CNSTHI|1443790800|     13|      2|     0|    10|     0|    275|   2015|1.2345| 2015275|
      |    F| C_FNHXUT728.CNSTHI|1443790800|     13|      2|     0|    10|     0|    275|   2015|1.2345| 2015275|
      |    F| C_FNHXUR806.CNSTHI|1443790800|     13|      2|     0|    10|     0|    275|   2015|1.2345| 2015275|
      +-----+-------------------+----------+-------+-------+------+------+------+-------+-------+------+--------+
      
      A = A.unionAll(B)
      A.show()
      +---+-------------------+----------+------+------+-----+----+-------+------+-------+-------+---------+
      |tag|           year_day|   tm_hour|tm_min|tm_sec|dtype|time|tm_mday|tm_mon|tm_yday|tm_year|    value|
      +---+-------------------+----------+------+------+-----+----+-------+------+-------+-------+---------+
      |  F|C_FNHXUT701Z.CNSTLO|1443790800|    13|     2|    0|  10|      0|   275|   2015| 1.2345|2015275.0|
      |  F|C_FNHXUDP713.CNSTHI|1443790800|    13|     2|    0|  10|      0|   275|   2015| 1.2345|2015275.0|
      |  F| C_FNHXUT718.CNSTHI|1443790800|    13|     2|    0|  10|      0|   275|   2015| 1.2345|2015275.0|
      |  F|C_FNHXUT703Z.CNSTLO|1443790800|    13|     2|    0|  10|      0|   275|   2015| 1.2345|2015275.0|
      |  F|C_FNHXUR716A.CNSTLO|1443790800|    13|     2|    0|  10|      0|   275|   2015| 1.2345|2015275.0|
      |  F|C_FNHXUT803Z.CNSTHI|1443790800|    13|     2|    0|  10|      0|   275|   2015| 1.2345|2015275.0|
      |  F| C_FNHXUT728.CNSTHI|1443790800|    13|     2|    0|  10|      0|   275|   2015| 1.2345|2015275.0|
      |  F| C_FNHXUR806.CNSTHI|1443790800|    13|     2|    0|  10|      0|   275|   2015| 1.2345|2015275.0|
      +---+-------------------+----------+------+------+-----+----+-------+------+-------+-------+---------+
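
      The output above is consistent with unionAll matching columns by position rather than by name, as SQL UNION ALL does, which would also fit the "Not A Problem" resolution. The following is a minimal plain-Python sketch of that behavior, not the Spark API: the (column_names, rows) representation and the `union_by_position` helper are illustrative assumptions, and the toy does not model Spark's type coercion (the reason `value` shows 2015275.0 above).

```python
# Toy model of a positional union. Plain Python, not the Spark API.
# A "frame" here is a (column_names, rows) pair.

def union_by_position(left, right):
    """Append right's rows under left's column names, ignoring right's names."""
    left_cols, left_rows = left
    right_cols, right_rows = right
    if len(left_cols) != len(right_cols):
        raise ValueError("both sides need the same number of columns")
    return left_cols, left_rows + right_rows

# Column orders copied from the report; A is empty, B holds one sample row.
A_cols = ["tag", "year_day", "tm_hour", "tm_min", "tm_sec", "dtype",
          "time", "tm_mday", "tm_mon", "tm_yday", "tm_year", "value"]
B_cols = ["dtype", "tag", "time", "tm_hour", "tm_mday", "tm_min",
          "tm_mon", "tm_sec", "tm_yday", "tm_year", "value", "year_day"]
B_row = ("F", "C_FNHXUT701Z.CNSTLO", 1443790800, 13, 2, 0,
         10, 0, 275, 2015, 1.2345, 2015275)

cols, rows = union_by_position((A_cols, []), (B_cols, [B_row]))

# Label B's values with A's column names, position by position.
labelled = dict(zip(cols, rows[0]))
print(labelled["tag"])       # B's dtype value ends up under "tag"
print(labelled["year_day"])  # B's tag value ends up under "year_day"
```

      In this model B's first value lands under A's first column name, which is the same shift visible in the unionAll output.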
      

      Reordering the columns of A to match B before calling unionAll works fine:

      C = A.select("dtype","tag","time","tm_hour","tm_mday","tm_min","tm_mon","tm_sec","tm_yday","tm_year","value","year_day")
      
      A = C.unionAll(B)
      A.show()
      
      +-----+-------------------+----------+-------+-------+------+------+------+-------+-------+------+--------+
      |dtype|                tag|      time|tm_hour|tm_mday|tm_min|tm_mon|tm_sec|tm_yday|tm_year| value|year_day|
      +-----+-------------------+----------+-------+-------+------+------+------+-------+-------+------+--------+
      |    F|C_FNHXUT701Z.CNSTLO|1443790800|     13|      2|     0|    10|     0|    275|   2015|1.2345| 2015275|
      |    F|C_FNHXUDP713.CNSTHI|1443790800|     13|      2|     0|    10|     0|    275|   2015|1.2345| 2015275|
      |    F| C_FNHXUT718.CNSTHI|1443790800|     13|      2|     0|    10|     0|    275|   2015|1.2345| 2015275|
      |    F|C_FNHXUT703Z.CNSTLO|1443790800|     13|      2|     0|    10|     0|    275|   2015|1.2345| 2015275|
      |    F|C_FNHXUR716A.CNSTLO|1443790800|     13|      2|     0|    10|     0|    275|   2015|1.2345| 2015275|
      |    F|C_FNHXUT803Z.CNSTHI|1443790800|     13|      2|     0|    10|     0|    275|   2015|1.2345| 2015275|
      |    F| C_FNHXUT728.CNSTHI|1443790800|     13|      2|     0|    10|     0|    275|   2015|1.2345| 2015275|
      |    F| C_FNHXUR806.CNSTHI|1443790800|     13|      2|     0|    10|     0|    275|   2015|1.2345| 2015275|
      +-----+-------------------+----------+-------+-------+------+------+------+-------+-------+------+--------+
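
      The manual select above can be generalized so the column list does not have to be typed out. This is a minimal sketch: `union_by_name` is a hypothetical helper, not part of Spark 1.6, and it assumes both arguments expose `.columns`, `.select(*names)` and `.unionAll(other)` the way PySpark DataFrames do. Spark 2.3 and later ship a built-in `DataFrame.unionByName` that serves the same purpose.

```python
def union_by_name(left, right):
    """Reorder left's columns to match right's order, then union by position.

    Hypothetical helper, not a Spark API. Works on any objects that provide
    `.columns`, `.select(*names)` and `.unionAll(other)`.
    """
    # Refuse to union frames whose column names do not match exactly,
    # since a positional union would silently mislabel the data.
    if sorted(left.columns) != sorted(right.columns):
        raise ValueError(
            "column names differ: %s vs %s" % (left.columns, right.columns)
        )
    # Selecting by right's column order makes the positional union line up.
    return left.select(*right.columns).unionAll(right)
```

      With the DataFrames from the report, `A = union_by_name(A, B)` would stand in for the explicit `select` followed by `unionAll`.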
      
      


            People

              Assignee: Unassigned
              Reporter: Prabhu Joseph (prabhujoseph)
              Votes: 0
              Watchers: 6
