Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
Only one of the properties are allowed [oozie.wf.rerun.skip.nodes OR oozie.wf.rerun.failnodes]
Reproduction:
1. Create a workflow with more than one node, e.g. a fork with three parallel shell actions. Make sure one of them fails.
2. Rerun with 'oozie.wf.rerun.failnodes' set.
3. Rerun again with 'oozie.wf.rerun.skip.nodes' and check 'Skip all successful nodes'.
You will get the following error.
Error: E0404 : E0404: Only one of the properties are allowed [oozie.wf.rerun.skip.nodes OR oozie.wf.rerun.failnodes]
When a user reruns a workflow job with oozie.wf.rerun.failnodes=true and the job then fails at a later step, there is no option to resubmit the workflow with oozie.wf.rerun.skip.nodes=action1,action2 so that it can be resubmitted from predecessor steps.
Currently, once a workflow fails and one of the rerun options is used for a rerun, that option is merged into the job's configuration, and unlike regular Oozie configuration properties or variables there is no way to override it.
We have a few options:
1. If fail.nodes and skip.nodes are specified at the same time (or one of them was carried over from a previous wf run), we can take the union of the two node sets (fail.nodes union skip.nodes)
2. Add a way to remove properties (this is also potentially helpful for other use cases)
3. The "newest" property (oozie.wf.rerun.skip.nodes or oozie.wf.rerun.failnodes) takes priority and the previous is ignored
4. Make oozie.wf.rerun.skip.nodes or oozie.wf.rerun.failnodes somehow not persist in the DB
Part of this JIRA would be to figure out which is the best option.
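To make options 1 and 3 easier to compare, here is a minimal sketch of each on plain Java collections. It is not Oozie code: the class `RerunOptions` and both method names are hypothetical, and it assumes that oozie.wf.rerun.failnodes=true means "skip every node that already succeeded".

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration only; not part of Oozie. */
final class RerunOptions {
    static final String SKIP_NODES = "oozie.wf.rerun.skip.nodes";
    static final String FAIL_NODES = "oozie.wf.rerun.failnodes";

    /**
     * Option 1: when both properties are present, skip the union of
     * (a) the nodes that already succeeded, if failnodes is true
     *     (assumed meaning of failnodes), and
     * (b) the nodes listed explicitly in skip.nodes.
     */
    static Set<String> unionSkipNodes(Set<String> succeededNodes,
                                      boolean failNodes,
                                      String skipNodesCsv) {
        Set<String> skip = new LinkedHashSet<>();
        if (failNodes) {
            skip.addAll(succeededNodes);
        }
        if (skipNodesCsv != null) {
            for (String name : skipNodesCsv.split(",")) {
                String trimmed = name.trim();
                if (!trimmed.isEmpty()) {
                    skip.add(trimmed);
                }
            }
        }
        return skip;
    }

    /**
     * Option 3: a rerun property submitted with the current rerun replaces
     * a stored rerun property of the other kind. If the new submission sets
     * both properties itself, both are kept and the conflict remains.
     */
    static Map<String, String> newestWins(Map<String, String> stored,
                                          Map<String, String> submitted) {
        Map<String, String> merged = new HashMap<>(stored);
        if (submitted.containsKey(SKIP_NODES)) {
            merged.remove(FAIL_NODES);
        }
        if (submitted.containsKey(FAIL_NODES)) {
            merged.remove(SKIP_NODES);
        }
        merged.putAll(submitted);
        return merged;
    }
}
```

With option 1, a stored failnodes=true plus a new skip.nodes=action1,action2 skips the succeeded nodes as well as action1 and action2. With option 3, the stored failnodes entry is dropped and only the new skip.nodes list applies.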
Issue Links
- relates to OOZIE-3265 Properties RERUN_FAIL_NODES and RERUN_SKIP_NODES should be able to appear together (Closed)