Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
When exactly-once job publishing is enabled, `FsRenameCommitStep` is used to ensure that the files reach the destination. However, it does not leave the destination in the same state as `ParallelRunner.movePath`, the method used when exactly-once job publishing is disabled. `ParallelRunner.movePath` both moves the path and sets the group, whereas `FsRenameCommitStep.execute` only moves the file.
- ParallelRunner.movePath
``` java
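if (fs.exists(src)) {
  // Move the path to its destination.
  HadoopUtils.movePath(fs, src, dstFs, dst, overwrite, dstFs.getConf());
  // Then set the group on the destination when one is configured.
  if (group.isPresent()) {
    HadoopUtils.setGroup(dstFs, dst, group.get());
  }
}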
```
- FsRenameCommitStep.execute
``` java
log.info(String.format("Moving %s to %s", this.srcPath, this.dstPath));
HadoopUtils.movePath(this.srcFs, this.srcPath, this.dstFs, this.dstPath, this.dstFs.getConf());
```
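To make the difference in resulting state concrete, here is a minimal, self-contained sketch. It does not use the Hadoop or Gobblin APIs: `InMemoryFs`, `RenameCommitSketch`, `moveOnly`, and `moveAndSetGroup` are hypothetical names invented for illustration, and the "file system" is just a map from path to group. It only shows the idea that a commit step which also applies the group after the move would end in the same state as the `ParallelRunner.movePath` route; the real fix may well look different.

``` java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for a file system: maps each path to its group. Not the Hadoop API. */
class InMemoryFs {
  private final Map<String, String> groups = new HashMap<>();

  void create(String path, String group) { groups.put(path, group); }

  boolean exists(String path) { return groups.containsKey(path); }

  /** Moves a path; the group travels with the file, as in a plain rename. */
  void move(String src, String dst) {
    if (!exists(src)) {
      throw new IllegalStateException("Missing source: " + src);
    }
    groups.put(dst, groups.remove(src));
  }

  void setGroup(String path, String group) {
    if (!exists(path)) {
      throw new IllegalStateException("Missing path: " + path);
    }
    groups.put(path, group);
  }

  String groupOf(String path) { return groups.get(path); }
}

class RenameCommitSketch {
  /** Models the behaviour reported for FsRenameCommitStep.execute: move only. */
  static void moveOnly(InMemoryFs fs, String src, String dst) {
    fs.move(src, dst);
  }

  /** Models the behaviour reported for ParallelRunner.movePath: move, then set the group if present. */
  static void moveAndSetGroup(InMemoryFs fs, String src, String dst, Optional<String> group) {
    if (fs.exists(src)) {
      fs.move(src, dst);
      if (group.isPresent()) {
        fs.setGroup(dst, group.get());
      }
    }
  }

  public static void main(String[] args) {
    InMemoryFs fs = new InMemoryFs();
    fs.create("/staging/a", "writers");
    fs.create("/staging/b", "writers");

    moveOnly(fs, "/staging/a", "/output/a");
    moveAndSetGroup(fs, "/staging/b", "/output/b", Optional.of("readers"));

    System.out.println(fs.groupOf("/output/a")); // prints "writers"
    System.out.println(fs.groupOf("/output/b")); // prints "readers"
  }
}
```

In this toy model the two routes leave the destination with different groups, which is the inconsistency described above. Passing the optional group into the commit step and applying it after the move is one way to bring the two into line.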
Github Url : https://github.com/linkedin/gobblin/issues/781
Github Reporter : jbaranick
Github Created At : 2016-03-01T16:44:23Z
Github Updated At : 2017-01-12T04:44:04Z
Comments
jbaranick wrote on 2016-03-02T13:52:27Z : @zliu41 you might want to look at this one
Github Url : https://github.com/linkedin/gobblin/issues/781#issuecomment-191246073
zliu41 wrote on 2016-03-02T16:20:10Z : Good catch. It's an easy fix but we'll probably do some surgery to exactly-once in the next few weeks so this will be fixed as part of that.
Github Url : https://github.com/linkedin/gobblin/issues/781#issuecomment-191310800