Details
-
Improvement
-
Status: Closed
-
Critical
-
Resolution: Fixed
-
None
Description
This exception may be thrown during archive commit and clean operation,I'm not sure if there are other places where this exception might be thrown.
For archive commit ,the exception will be thrown when the .rollback file size is 0.
For clean operation, the exception will also be thrown when the size of `. clean` and `. clean.requested` and `. clean.inflight` files is 0.
I think we can filter out files of size 0 when we get clean and archive instances? in addition, it is unclear what caused the .file size to be 0. Is it due to high concurrency or abnormal exit of the program due to network reasons?
1. .rollback is 0
o.a.hudi.table.HoodieTimelineArchiveLog Failed to archive commits, .commit file: 20210927181216.rollback java.io.IOException: Not an Avro data file at org.apache.avro.file.DataFileReader.openReader(DataFileReader.java:50) at org.apache.hudi.common.table.timeline.TimelineMetadataUtils.deserializeAvroMetadata(TimelineMetadataUtils.java:178) at org.apache.hudi.client.utils.MetadataConversionUtils.createMetaWrapper(MetadataConversionUtils.java:103) at org.apache.hudi.table.HoodieTimelineArchiveLog.convertToAvroRecord(HoodieTimelineArchiveLog.java:341) at org.apache.hudi.table.HoodieTimelineArchiveLog.archive(HoodieTimelineArchiveLog.java:305) at org.apache.hudi.table.HoodieTimelineArchiveLog.archiveIfRequired(HoodieTimelineArchiveLog.java:128) at org.apache.hudi.client.AbstractHoodieWriteClient.postCommit(AbstractHoodieWriteClient.java:439) at org.apache.hudi.client.HoodieJavaWriteClient.postWrite(HoodieJavaWriteClient.java:187) at org.apache.hudi.client.HoodieJavaWriteClient.insert(HoodieJavaWriteClient.java:129) at org.apache.nifi.processors.javaHudi.JavaHudi.write(JavaHudi.java:427) at org.apache.nifi.processors.javaHudi.JavaHudi.onTrigger(JavaHudi.java:331) at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1166) at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:208) at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)
2. . clean is 0
org.apache.hudi.exception.HoodieIOException: Failed to schedule clean operation at org.apache.hudi.table.action.clean.BaseCleanPlanActionExecutor.requestClean(BaseCleanPlanActionExecutor.java:95) at org.apache.hudi.table.action.clean.BaseCleanPlanActionExecutor.requestClean(BaseCleanPlanActionExecutor.java:107) at org.apache.hudi.table.action.clean.BaseCleanPlanActionExecutor.execute(BaseCleanPlanActionExecutor.java:129) at org.apache.hudi.table.HoodieJavaCopyOnWriteTable.scheduleCleaning(HoodieJavaCopyOnWriteTable.java:182) at org.apache.hudi.client.AbstractHoodieWriteClient.scheduleTableServiceInternal(AbstractHoodieWriteClient.java:961) at org.apache.hudi.client.AbstractHoodieWriteClient.clean(AbstractHoodieWriteClient.java:653) at org.apache.hudi.client.AbstractHoodieWriteClient.clean(AbstractHoodieWriteClient.java:641) at org.apache.hudi.client.AbstractHoodieWriteClient.clean(AbstractHoodieWriteClient.java:672) at org.apache.hudi.client.AbstractHoodieWriteClient.autoCleanOnCommit(AbstractHoodieWriteClient.java:505) at org.apache.hudi.client.AbstractHoodieWriteClient.postCommit(AbstractHoodieWriteClient.java:440) at org.apache.hudi.client.HoodieJavaWriteClient.postWrite(HoodieJavaWriteClient.java:187) at org.apache.hudi.client.HoodieJavaWriteClient.insert(HoodieJavaWriteClient.java:129) at org.apache.nifi.processors.javaHudi.JavaHudi.write(JavaHudi.java:405) at org.apache.nifi.processors.javaHudi.JavaHudi.onTrigger(JavaHudi.java:335) at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1166) at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:208) at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.IOException: Not an Avro data file at org.apache.avro.file.DataFileReader.openReader(DataFileReader.java:50) at org.apache.hudi.common.table.timeline.TimelineMetadataUtils.deserializeAvroMetadata(TimelineMetadataUtils.java:178) at org.apache.hudi.common.table.timeline.TimelineMetadataUtils.deserializeHoodieCleanMetadata(TimelineMetadataUtils.java:152) at org.apache.hudi.table.action.clean.CleanPlanner.getPartitionPathsForCleanByCommits(CleanPlanner.java:150) at org.apache.hudi.table.action.clean.CleanPlanner.getPartitionPathsToClean(CleanPlanner.java:126) at org.apache.hudi.table.action.clean.BaseCleanPlanActionExecutor.requestClean(BaseCleanPlanActionExecutor.java:73) ... 24 more
3. `. clean.requested` and `. clean.inflight` is 0
o.a.h.t.a.clean.BaseCleanActionExecutor Failed to perform previous clean operation, instant: [==>20211011143809__clean__REQUESTED] org.apache.hudi.exception.HoodieIOException: Not an Avro data file at org.apache.hudi.table.action.clean.BaseCleanActionExecutor.runPendingClean(BaseCleanActionExecutor.java:87) at org.apache.hudi.table.action.clean.BaseCleanActionExecutor.lambda$execute$0(BaseCleanActionExecutor.java:137) at java.util.ArrayList.forEach(ArrayList.java:1257) at org.apache.hudi.table.action.clean.BaseCleanActionExecutor.execute(BaseCleanActionExecutor.java:134) at org.apache.hudi.table.HoodieJavaCopyOnWriteTable.clean(HoodieJavaCopyOnWriteTable.java:188) at org.apache.hudi.client.AbstractHoodieWriteClient.clean(AbstractHoodieWriteClient.java:660) at org.apache.hudi.client.AbstractHoodieWriteClient.clean(AbstractHoodieWriteClient.java:641) at org.apache.hudi.client.AbstractHoodieWriteClient.clean(AbstractHoodieWriteClient.java:672) at org.apache.hudi.client.AbstractHoodieWriteClient.autoCleanOnCommit(AbstractHoodieWriteClient.java:505) at org.apache.hudi.client.AbstractHoodieWriteClient.postCommit(AbstractHoodieWriteClient.java:440) at org.apache.hudi.client.HoodieJavaWriteClient.postWrite(HoodieJavaWriteClient.java:187) at org.apache.hudi.client.HoodieJavaWriteClient.insert(HoodieJavaWriteClient.java:129) at org.apache.nifi.processors.javaHudi.JavaHudi.write(JavaHudi.java:401) at org.apache.nifi.processors.javaHudi.JavaHudi.onTrigger(JavaHudi.java:305) at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1166) at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:208) at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.IOException: Not an Avro data file at org.apache.avro.file.DataFileReader.openReader(DataFileReader.java:50) at org.apache.hudi.common.table.timeline.TimelineMetadataUtils.deserializeAvroMetadata(TimelineMetadataUtils.java:178) at org.apache.hudi.common.util.CleanerUtils.getCleanerPlan(CleanerUtils.java:106) at org.apache.hudi.table.action.clean.BaseCleanActionExecutor.runPendingClean(BaseCleanActionExecutor.java:84) ... 24 common frames omitted