Apache Hudi / HUDI-5689

CDC fails in Deltastreamer

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.13.0
    • Component/s: None

    Description

      After enabling CDC, Deltastreamer fails to ingest data (0.13.0-rc1):

      spark-submit \
      --master yarn \
      --jars /mnt1/hudi-jars/hudi-spark-bundle.jar,/mnt1/hudi-jars/hudi-utilities-slim-bundle.jar \
      --deploy-mode cluster \
      --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
      --conf spark.sql.avro.datetimeRebaseModeInRead=CORRECTED \
      --conf spark.sql.avro.datetimeRebaseModeInWrite=CORRECTED \
      --conf spark.sql.parquet.datetimeRebaseModeInRead=CORRECTED \
      --conf spark.sql.parquet.datetimeRebaseModeInWrite=CORRECTED \
      --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer /mnt1/hudi-jars/hudi-utilities-slim-bundle.jar \
      --table-type COPY_ON_WRITE \
      --source-ordering-field ts \
      --source-class org.apache.hudi.utilities.sources.ParquetDFSSource \
      --target-base-path <base_path> \
      --target-table emr \
      --payload-class org.apache.hudi.common.model.AWSDmsAvroPayload \
      --hoodie-conf hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.SimpleKeyGenerator \
      --hoodie-conf hoodie.datasource.write.recordkey.field=_id \
      --hoodie-conf hoodie.table.cdc.enabled=true \
      --hoodie-conf hoodie.table.cdc.supplemental.logging.mode=cdc_data_before_after \
      --hoodie-conf hoodie.datasource.write.partitionpath.field=partition \
      --hoodie-conf hoodie.deltastreamer.source.dfs.root=<source_path>

      The commit fails and is rolled back, and the application exits with:

      23/02/01 22:37:12 ERROR Client: Application diagnostics message: User class threw exception: org.apache.hudi.exception.HoodieException: Commit 20230201223554790 failed and rolled-back !
      	at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:740)
      	at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:393)
      	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$2(HoodieDeltaStreamer.java:206)
      	at org.apache.hudi.common.util.Option.ifPresent(Option.java:97)
      	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:204)
      	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:573)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:742)
      
      Exception in thread "main" org.apache.spark.SparkException: Application application_1675271857569_0003 finished with failed status
      	at org.apache.spark.deploy.yarn.Client.run(Client.scala:1354)
      	at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1776)
      	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1006)
      	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
      	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
      	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
      	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1095)
      	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1104)
      	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 
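
The trace above shows only the top-level HoodieException reported by the driver; the underlying write error is not included. A minimal sketch for pulling it out of the aggregated YARN logs, assuming log aggregation is enabled on the cluster. The `root_causes` helper name is hypothetical, and the five lines of context are an arbitrary choice:

```shell
# Hypothetical helper: read a log on stdin and print each "Caused by:" line
# together with the five lines that follow it.
root_causes() {
  grep -A 5 'Caused by:'
}

# Usage, assuming YARN log aggregation is enabled (application ID taken from the trace above):
#   yarn logs -applicationId application_1675271857569_0003 | root_causes
```

If nothing is printed, the failure may have been logged without a chained exception, in which case searching the same logs for `ERROR` is a reasonable next step.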

       

            People

              Assignee: Shiyan Xu (xushiyan)
              Reporter: Ethan Guo (guoyihua; this is the old account, please use "yihua")
              Votes: 0
              Watchers: 1