Details
Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Description
After enabling CDC, Deltastreamer fails to ingest data (0.13.0-rc1):
spark-submit \
  --master yarn \
  --jars /mnt1/hudi-jars/hudi-spark-bundle.jar,/mnt1/hudi-jars/hudi-utilities-slim-bundle.jar \
  --deploy-mode cluster \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.sql.avro.datetimeRebaseModeInRead=CORRECTED \
  --conf spark.sql.avro.datetimeRebaseModeInWrite=CORRECTED \
  --conf spark.sql.parquet.datetimeRebaseModeInRead=CORRECTED \
  --conf spark.sql.parquet.datetimeRebaseModeInWrite=CORRECTED \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  /mnt1/hudi-jars/hudi-utilities-slim-bundle.jar \
  --table-type COPY_ON_WRITE \
  --source-ordering-field ts \
  --source-class org.apache.hudi.utilities.sources.ParquetDFSSource \
  --target-base-path <base_path> \
  --target-table emr \
  --payload-class org.apache.hudi.common.model.AWSDmsAvroPayload \
  --hoodie-conf hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.SimpleKeyGenerator \
  --hoodie-conf hoodie.datasource.write.recordkey.field=_id \
  --hoodie-conf hoodie.table.cdc.enabled=true \
  --hoodie-conf hoodie.table.cdc.supplemental.logging.mode=cdc_data_before_after \
  --hoodie-conf hoodie.datasource.write.partitionpath.field=partition \
  --hoodie-conf hoodie.deltastreamer.source.dfs.root=<source_path>
23/02/01 22:37:12 ERROR Client: Application diagnostics message: User class threw exception: org.apache.hudi.exception.HoodieException: Commit 20230201223554790 failed and rolled-back !
    at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:740)
    at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:393)
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$2(HoodieDeltaStreamer.java:206)
    at org.apache.hudi.common.util.Option.ifPresent(Option.java:97)
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:204)
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:573)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:742)
Exception in thread "main" org.apache.spark.SparkException: Application application_1675271857569_0003 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1354)
    at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1776)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1006)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1095)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1104)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
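To narrow the failure down to the CDC options, it may help to run the same job twice, once with and once without the two hoodie.table.cdc.* settings. The following is only a minimal Python sketch for generating the --hoodie-conf arguments in both variants. The option keys and values are copied from the command above; the helper name hoodie_conf_args and the dictionary names are hypothetical and not part of Hudi.

```python
import shlex

# Hudi options copied from the command in this report.
BASE_HOODIE_CONF = {
    "hoodie.datasource.write.keygenerator.class": "org.apache.hudi.keygen.SimpleKeyGenerator",
    "hoodie.datasource.write.recordkey.field": "_id",
    "hoodie.datasource.write.partitionpath.field": "partition",
}

# The two CDC options that were enabled when the commit failed.
CDC_HOODIE_CONF = {
    "hoodie.table.cdc.enabled": "true",
    "hoodie.table.cdc.supplemental.logging.mode": "cdc_data_before_after",
}


def hoodie_conf_args(source_root, enable_cdc):
    """Return the --hoodie-conf portion of the argument list (hypothetical helper)."""
    conf = dict(BASE_HOODIE_CONF)
    if enable_cdc:
        conf.update(CDC_HOODIE_CONF)
    conf["hoodie.deltastreamer.source.dfs.root"] = source_root
    args = []
    for key, value in conf.items():
        args += ["--hoodie-conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print both variants so they can be appended to the spark-submit command.
    for enable_cdc in (False, True):
        print(f"# CDC enabled: {enable_cdc}")
        print(shlex.join(hoodie_conf_args("<source_path>", enable_cdc)))
```

If the variant without the CDC options commits successfully against the same source data, that would point at the CDC write path rather than the source or payload configuration.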