Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
Description
Example:
val tableName = generateTableName // Create table spark.sql( s""" |create table $tableName ( | id int, | name string, | price double, | ts long, | dt string |) using hudi | location '${tmp.getCanonicalPath}/$tableName' | options ( | primaryKey ='id', | preCombineField = 'ts' | ) """.stripMargin) spark.sql( s""" |merge into $tableName as t0 |using ( | select 1 as id, '2021-05-05' as dt, 1002 as ts, 97 as price, 'a1' as name union all | select 1 as id, '2021-05-05' as dt, 1003 as ts, 98 as price, 'a2' as name union all | select 2 as id, '2021-05-05' as dt, 1001 as ts, 99 as price, 'a3' as name | ) as s0 |on t0.id = s0.id |when matched then update set * |when not matched then insert * |""".stripMargin) spark.sql(s"select id, name, price, ts, dt from $tableName").show()
Fow now,the result is:
----------------
id | name | price | ts | dt |
----------------
2 | 2021-05-05 | 99.0 | 99 | a3 |
1 | 2021-05-05 | 98.0 | 98 | a2 |
----------------
When the order of the column types of souceTable is different from that of the column types of targetTable
spark.sql( s""" |merge into ${tableName} as t0 |using ( | select 1 as id, 'a1' as name, 1002 as ts, '2021-05-05' as dt, 97 as price union all | select 1 as id, 'a2' as name, 1003 as ts, '2021-05-05' as dt, 98 as price union all | select 2 as id, 'a3' as name, 1001 as ts, '2021-05-05' as dt, 99 as price | ) as s0 |on t0.id = s0.id |when matched then update set * |when not matched then insert * |""".stripMargin)
It will throw an exception:
[ERROR] 2021-08-05 21:48:53,941 org.apache.hudi.io.HoodieWriteHandle - Error writing record HoodieRecord{key=HoodieKey { recordKey=id:2 partitionPath=}, currentLocation='null', newLocation='null'} java.lang.RuntimeException: Error in execute expression: org.apache.spark.unsafe.types.UTF8String cannot be cast to java.lang.Integer. Expressions is: [boundreference() AS `id` boundreference() AS `name` CAST(boundreference() AS `price` AS DOUBLE) CAST(boundreference() AS `ts` AS BIGINT) CAST(boundreference() AS `dt` AS STRING)] CodeBody is: { ...... Caused by: java.lang.ClassCastException: org.apache.spark.unsafe.types.UTF8String cannot be cast to java.lang.IntegerCaused by: java.lang.ClassCastException: org.apache.spark.unsafe.types.UTF8String cannot be cast to java.lang.Integer at org.apache.hudi.sql.payload.ExpressionPayloadEvaluator_366797ae_4c30_4862_8222_7be486ede4f8.eval(Unknown Source) at org.apache.spark.sql.hudi.command.payload.ExpressionPayload.org$apache$spark$sql$hudi$command$payload$ExpressionPayload$$evaluate(ExpressionPayload.scala:258) ... 18 more
Attachments
Issue Links
- links to