Details
Type: Bug
Status: Open
Priority: P3
Resolution: Unresolved
Affects Version/s: 2.16.0, 2.18.0, 2.19.0, 2.22.0
None
None
Environment: Azure Databricks
Description
After upgrading from Beam 2.13.0 to 2.16.0 or later, our pipelines running on Spark/HDFS fail.
Platform: Azure Databricks.
Beam 2.13.0 works fine. The failures appear only after migrating to 2.16.0 or later, and only on large jobs (smaller jobs run fine).
Caused by: java.io.IOException: Unable to rename resource wasbs://****/output/npstand75k0727_1/np/.temp-beam-64a00562-5dcd-4bcd-9c5a-be7cff1231f3/483d5498-ed9c-46fd-b1ce-8647fa5c8a06 to wasbs://*****/output/npstand75k0727_1/np/confinements/part-00000-of-00001.txt. No further information provided by underlying filesystem.
Caused by: java.io.IOException: Unable to rename resource wasbs://**/output/npstand75k0727_1/np/.temp-beam-64a00562-5dcd-4bcd-9c5a-be7cff1231f3/483d5498-ed9c-46fd-b1ce-8647fa5c8a06 to wasbs://***/output/npstand75k0727_1/np/confinements/part-00000-of-00001.txt. No further information provided by underlying filesystem.
    at org.apache.beam.sdk.io.hdfs.HadoopFileSystem.rename(HadoopFileSystem.java:287)
    at org.apache.beam.sdk.io.FileSystems.rename(FileSystems.java:327)
    at org.apache.beam.sdk.io.FileBasedSink$WriteOperation.moveToOutputFiles(FileBasedSink.java:755)
    at org.apache.beam.sdk.io.WriteFiles$FinalizeTempFileBundles$FinalizeFn.process(WriteFiles.java:850)
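For context, the frames in the trace are in the finalize step, where a file under the .temp-beam-... directory is renamed to its final part-NNNNN-of-NNNNN name. Below is a minimal sketch of that write-to-temp-then-rename pattern, using only java.nio.file on a local filesystem. The class name TempThenRename and the method writeThenRename are hypothetical, and this is not Beam's implementation and not a reproduction of the failure; it only illustrates which step is failing.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.UUID;

public class TempThenRename {

    // Hypothetical helper (not a Beam API): writes the lines to a uniquely named
    // file in a temp directory under outputDir, then renames it to finalName.
    static Path writeThenRename(Path outputDir, String finalName, List<String> lines)
            throws IOException {
        // Step 1: write to a temp location so readers never see a partial file.
        Path tempDir = outputDir.resolve(".temp-" + UUID.randomUUID());
        Files.createDirectories(tempDir);
        Path tempFile = tempDir.resolve(UUID.randomUUID().toString());
        Files.write(tempFile, lines);

        // Step 2: rename the temp file to its final name. This corresponds to
        // the step that fails in the trace above.
        Path finalFile = outputDir.resolve(finalName);
        try {
            Files.move(tempFile, finalFile, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new IOException(
                "Unable to rename resource " + tempFile + " to " + finalFile, e);
        }

        // Step 3: remove the now-empty temp directory.
        Files.delete(tempDir);
        return finalFile;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sink-demo");
        Path out = writeThenRename(
            dir, "part-00000-of-00001.txt", List.of("line one", "line two"));
        System.out.println(Files.readAllLines(out));
    }
}
```

On a local filesystem the rename is a single call that either succeeds or throws with a cause. In the trace above, the filesystem reports no further information, so the wrapped IOException carries only the source and destination paths.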