Details
Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Description
gobblin-on-temporal execution has been failing for job types other than iceberg-distcp (which uses `CopySource`). In particular, the commit step fails with:
java.lang.IllegalArgumentException: Missing required property writer.output.dir
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
at org.apache.gobblin.util.WriterUtils.getWriterOutputDir(WriterUtils.java:121)
at org.apache.gobblin.publisher.BaseDataPublisher.publishData(BaseDataPublisher.java:390)
at org.apache.gobblin.publisher.BaseDataPublisher.publishMultiTaskData(BaseDataPublisher.java:379)
at org.apache.gobblin.publisher.BaseDataPublisher.publishData(BaseDataPublisher.java:366)
at org.apache.gobblin.publisher.DataPublisher.publish(DataPublisher.java:81)
at org.apache.gobblin.runtime.SafeDatasetCommit.commitDataset(SafeDatasetCommit.java:260)
at org.apache.gobblin.runtime.SafeDatasetCommit.call(SafeDatasetCommit.java:168)
at org.apache.gobblin.runtime.SafeDatasetCommit.call(SafeDatasetCommit.java:64)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.apache.gobblin.util.executors.MDCPropagatingRunnable.run(MDCPropagatingRunnable.java:39)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
This is odd, because that same property had already been used before commit, while processing the `WorkUnit`. Moreover, logging shows it to be present within the `JobState`.
Even with a private build that hard-codes that property, a later error arises:
Caused by: java.lang.IllegalArgumentException: Can not create a Path from a null string
at org.apache.hadoop.fs.Path.checkPathArg(Path.java:159)
at org.apache.hadoop.fs.Path.<init>(Path.java:175)
at org.apache.hadoop.fs.Path.<init>(Path.java:110)
at org.apache.gobblin.runtime.FsDatasetStateStore.sanitizeDatasetStatestoreNameFromDatasetURN(FsDatasetStateStore.java:175)
at org.apache.gobblin.runtime.FsDatasetStateStore.persistDatasetState(FsDatasetStateStore.java:386)
at org.apache.gobblin.runtime.FsDatasetStateStore.persistDatasetState(FsDatasetStateStore.java:90)
at org.apache.gobblin.runtime.SafeDatasetCommit.persistDatasetState(SafeDatasetCommit.java:418)
at org.apache.gobblin.runtime.SafeDatasetCommit.call(SafeDatasetCommit.java:191)
... 8 more
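
One way to narrow down the first failure might be to log, just before commit, which required keys are actually visible in the state handed to the committer. The following is only a minimal sketch under stated assumptions: it uses plain `java.util.Properties` as a stand-in for Gobblin's state objects, and the class `RequiredPropsCheck` and method `missingKeys` are hypothetical names, not part of Gobblin.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical diagnostic helper; not part of Gobblin.
class RequiredPropsCheck {

    // Returns the required keys that are absent or blank in props.
    static List<String> missingKeys(Properties props, List<String> required) {
        List<String> missing = new ArrayList<>();
        for (String key : required) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Stand-in for the state seen at commit time (assumption for illustration).
        Properties commitTimeProps = new Properties();
        commitTimeProps.setProperty("job.name", "example");

        List<String> missing =
            missingKeys(commitTimeProps, Arrays.asList("writer.output.dir"));
        System.out.println("Missing at commit: " + missing);
    }
}
```

Running the same check against the state used while processing the `WorkUnit` and against the state used at commit could show at which point the property is lost.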