Details
Type: Improvement
Status: Open
Priority: Blocker
Resolution: Unresolved
Description
Background:
Inserting data into a Hudi table through Spark SQL fails with a HoodieException: (Part -) field not found in record. Acceptable fields were :[uuid, name, price]
......
    at org.apache.hudi.index.simple.HoodieSimpleIndex.fetchRecordLocationsForAffectedPartitions(HoodieSimpleIndex.java:142)
    at org.apache.hudi.index.simple.HoodieSimpleIndex.tagLocationInternal(HoodieSimpleIndex.java:113)
    at org.apache.hudi.index.simple.HoodieSimpleIndex.tagLocation(HoodieSimpleIndex.java:91)
    at org.apache.hudi.table.action.commit.HoodieWriteHelper.tag(HoodieWriteHelper.java:51)
    at org.apache.hudi.table.action.commit.HoodieWriteHelper.tag(HoodieWriteHelper.java:34)
    at org.apache.hudi.table.action.commit.BaseWriteHelper.write(BaseWriteHelper.java:53)
    ... 52 more
Caused by: org.apache.hudi.exception.HoodieException: (Part -) field not found in record. Acceptable fields were :[uuid, name, price]
    at org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldVal(HoodieAvroUtils.java:530)
    at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$write$11(HoodieSparkSqlWriter.scala:305)
    at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
    at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
    at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:194)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
    at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
    at org.apache.spark.scheduler.Task.run(Task.scala:131)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1509)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time 20230317222153522
    at org.apache.hudi.table.action.commit.BaseWriteHelper.write(BaseWriteHelper.java:64)
Steps to Reproduce:
-- 1. create a table without preCombineKey
CREATE TABLE default.test_hudi_default (
  uuid int,
  name string,
  price double
) USING hudi;

-- 2. config write operation to upsert
set hoodie.datasource.write.operation=upsert;

-- 3. insert data and exception occurs
insert into default.test_hudi_default select 1, 'name1', 1.1;
Root Cause:
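Hudi does not support the upsert operation on a table that has no preCombineKey. The exception message does not say so: it reports a missing field with an empty name, "(Part -)", which gives users no hint that the missing preCombineKey is the cause.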
Improvement:
We can check the write operation the user configured and, when it is upsert on a table without a preCombineKey, throw an exception with a more specific message, so that the user understands immediately what is wrong.
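A minimal sketch of such a check, written as standalone Java rather than as a patch to Hudi. The class UpsertPreCombineCheck, its validate method, and the message wording are hypothetical. It also assumes that the preCombine field reaches the writer under the option key hoodie.datasource.write.precombine.field; where the check would actually be wired into the write path is not decided here.

```java
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Hudi code base.
final class UpsertPreCombineCheck {

    // Write-operation key, as used in the reproduction steps.
    static final String OPERATION_KEY = "hoodie.datasource.write.operation";
    // Assumed key under which the preCombine field is passed to the writer.
    static final String PRECOMBINE_KEY = "hoodie.datasource.write.precombine.field";

    private UpsertPreCombineCheck() {
    }

    // Fails fast with an explicit message when upsert is requested
    // but no preCombine field is configured.
    static void validate(Map<String, String> writeOptions) {
        String operation = writeOptions.getOrDefault(OPERATION_KEY, "").trim();
        String preCombineField = writeOptions.getOrDefault(PRECOMBINE_KEY, "").trim();

        if ("upsert".equalsIgnoreCase(operation) && preCombineField.isEmpty()) {
            throw new IllegalArgumentException(
                "Write operation 'upsert' requires a preCombine field, but the table "
                    + "defines none. Configure a preCombine field for the table or "
                    + "choose a different write operation.");
        }
    }
}
```

Running a check like this before records are tagged would surface the configuration problem directly, instead of the "(Part -) field not found in record" error raised later from HoodieAvroUtils.getNestedFieldVal.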