OOZIE-2476: When one of the actions in a fork fails with a transient error, the workflow never joins.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.3.0
    • Component/s: None
    • Labels: None

      Description

      Observed multiple times in our production environment.
      If one of the actions in a fork fails with a transient error (but succeeds after a few retries), the workflow never joins.

      This happens when one of the actions in the fork fails to submit a job.
      On a transient error, Oozie re-queues the command with queue(this, retryDelayMillis). ActionStartXCommand doesn't load the job if it is not null.
      Before ActionStartXCommand runs again, other actions have already started and modified the job info. ActionStartXCommand still holds the old info, which it writes to the DB, so some workflow instance data is lost.
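
The lost update described above can be reproduced in miniature. The following is only a sketch under stated assumptions, not Oozie's code: `Job`, `JobStore`, `StartCommand`, and the `reloadOnRetry` flag are hypothetical stand-ins, and the real ActionStartXCommand and its persistence layer are considerably more involved. It shows a command that keeps its cached job across a retry and overwrites what a sibling action saved in the meantime. Reloading on every run is included only for contrast; the issue text does not describe the committed patch.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class StaleRetrySketch {

    /** Hypothetical stand-in for the workflow job row stored in the DB. */
    static final class Job {
        final Set<String> startedActions = new HashSet<>();

        Job copy() {
            Job j = new Job();
            j.startedActions.addAll(startedActions);
            return j;
        }
    }

    /** Hypothetical in-memory "database"; load and save hand out copies. */
    static final class JobStore {
        private final Map<String, Job> rows = new HashMap<>();

        Job load(String id) { return rows.get(id).copy(); }

        void save(String id, Job job) { rows.put(id, job.copy()); }
    }

    /** Hypothetical command that starts one forked action. */
    static final class StartCommand {
        private final JobStore store;
        private final String jobId;
        private final String action;
        private final boolean reloadOnRetry;
        private Job job; // cached across retries, like the command's job field

        StartCommand(JobStore store, String jobId, String action, boolean reloadOnRetry) {
            this.store = store;
            this.jobId = jobId;
            this.action = action;
            this.reloadOnRetry = reloadOnRetry;
        }

        /** Returns false on a simulated transient error (the caller re-queues). */
        boolean run(boolean failTransiently) {
            if (job == null || reloadOnRetry) {
                job = store.load(jobId); // skipped on retry when the job is cached
            }
            if (failTransiently) {
                return false;
            }
            job.startedActions.add(action);
            store.save(jobId, job); // writes the whole (possibly stale) job back
            return true;
        }
    }

    /** Runs fork actions A and B, with A failing once before B starts. */
    static Set<String> simulate(boolean reloadOnRetry) {
        JobStore store = new JobStore();
        store.save("wf", new Job());
        StartCommand a = new StartCommand(store, "wf", "A", reloadOnRetry);
        StartCommand b = new StartCommand(store, "wf", "B", reloadOnRetry);

        a.run(true);  // A hits a transient error and keeps its job snapshot
        b.run(false); // B starts and records itself in the DB
        a.run(false); // A retries after the delay
        return store.load("wf").startedActions;
    }

    public static void main(String[] args) {
        System.out.println("cached job on retry: " + simulate(false));
        System.out.println("reload on retry:     " + simulate(true));
    }
}
```

With the cached job, the record that B started is overwritten by A's stale copy, which in this toy model corresponds to the missing workflow instance data that keeps the join from firing. With a reload on each run, both actions remain recorded.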

        Activity

        rkanter Robert Kanter added a comment -

        Closing issue; Oozie 4.3.0 is released.

        rohini Rohini Palaniswamy added a comment -

        Robert Kanter has asked about a unit test and I missed it. Purshotam Shah, can you take a look at that?

        puru Purshotam Shah added a comment -

        Thanks Rohini for review. Committed to trunk.

        rohini Rohini Palaniswamy added a comment -

        +1

        hadoopqa Hadoop QA added a comment -

        Testing JIRA OOZIE-2476

        Cleaning local git workspace

        ----------------------------

        +1 PATCH_APPLIES
        +1 CLEAN
        -1 RAW_PATCH_ANALYSIS
        . +1 the patch does not introduce any @author tags
        . +1 the patch does not introduce any tabs
        . +1 the patch does not introduce any trailing spaces
        . +1 the patch does not introduce any line longer than 132
        . -1 the patch does not add/modify any testcase
        +1 RAT
        . +1 the patch does not seem to introduce new RAT warnings
        +1 JAVADOC
        . +1 the patch does not seem to introduce new Javadoc warnings
        +1 COMPILE
        . +1 HEAD compiles
        . +1 patch compiles
        . +1 the patch does not seem to introduce new javac warnings
        +1 BACKWARDS_COMPATIBILITY
        . +1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
        . +1 the patch does not modify JPA files
        -1 TESTS
        . Tests run: 1768
        . Tests failed: 1
        . Tests errors: 0

        . The patch failed the following testcases:

        . testCoordStatus_Failed(org.apache.oozie.command.coord.TestCoordChangeXCommand)

        +1 DISTRO
        . +1 distro tarball builds with the patch

        ----------------------------
        -1 Overall result, please check the reported -1(s)

        The full output of the test-patch run is available at

        . https://builds.apache.org/job/oozie-trunk-precommit-build/2795/

        rkanter Robert Kanter added a comment -

        The approach seems good to me. Would it be possible to create a unit test?

        hadoopqa Hadoop QA added a comment -

        Testing JIRA OOZIE-2476

        Cleaning local git workspace

        ----------------------------

        +1 PATCH_APPLIES
        +1 CLEAN
        -1 RAW_PATCH_ANALYSIS
        . +1 the patch does not introduce any @author tags
        . +1 the patch does not introduce any tabs
        . +1 the patch does not introduce any trailing spaces
        . +1 the patch does not introduce any line longer than 132
        . -1 the patch does not add/modify any testcase
        +1 RAT
        . +1 the patch does not seem to introduce new RAT warnings
        +1 JAVADOC
        . +1 the patch does not seem to introduce new Javadoc warnings
        +1 COMPILE
        . +1 HEAD compiles
        . +1 patch compiles
        . +1 the patch does not seem to introduce new javac warnings
        +1 BACKWARDS_COMPATIBILITY
        . +1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
        . +1 the patch does not modify JPA files
        -1 TESTS
        . Tests run: 1766
        . Tests failed: 2
        . Tests errors: 0

        . The patch failed the following testcases:

        . testForNoDuplicates(org.apache.oozie.event.TestEventGeneration)
        . testLastOnlyMaterialization(org.apache.oozie.command.coord.TestCoordMaterializeTransitionXCommand)

        +1 DISTRO
        . +1 distro tarball builds with the patch

        ----------------------------
        -1 Overall result, please check the reported -1(s)

        The full output of the test-patch run is available at

        . https://builds.apache.org/job/oozie-trunk-precommit-build/2780/


          People

          • Assignee: puru Purshotam Shah
          • Reporter: puru Purshotam Shah
          • Votes: 0
          • Watchers: 3
