OOZIE-2397: LAST_ONLY and NONE don't properly handle READY actions

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 4.2.0
    • Fix Version/s: 4.3.0
    • Component/s: core
    • Labels:
      None

      Description

      When using LAST_ONLY or NONE, actions are supposed to be able to transition from READY to SKIPPED if the right criteria are met, but they don't. This is in contrast to the timeout feature, which applies only to WAITING actions and is not supposed to affect READY ones.

      Here's a more detailed technical description of the problem:
      We handle LAST_ONLY in CoordMaterializeTransitionXCommand and CoordActionInputCheckXCommand. The former materializes the actions and sets "old" actions to SKIPPED while materializing them. The latter checks the input datasets for each action and determines whether a WAITING action can transition to READY (its deps are met), with all that entails, including changing the status to READY and queuing a CoordActionReadyXCommand. If the deps are not met because a dataset is not there yet, it requeues itself after some delay. So, these two commands only handle the materialization and WAITING states. However, LAST_ONLY is supposed to also do READY --> SKIPPED if its condition is met (unlike TIMEDOUT, which can only come from WAITING; this additional difference should probably be called out in the docs).

      CoordActionReadyXCommand needs to be updated to handle LAST_ONLY. It currently treats LAST_ONLY the same as LIFO (via CoordJobGetReadyActionsJPAExecutor), where the order is the only difference from FIFO. After retrieving all READY actions, it should check if any meet their LAST_ONLY condition, and if so, queue a CoordActionSkipXCommand for them (maybe make a bulk version?) instead of a CoordActionStartXCommand.

      We have the same issue with NONE, which has similar behavior.
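
The proposed check on READY actions can be sketched roughly as follows. This is a simplified, hypothetical model and not the code from the patch: the class `ReadyActionDispatchSketch`, its nested types, and the helper methods are invented for illustration, and the LAST_ONLY condition is assumed here to be "the next action's nominal time has already passed", computed from a fixed frequency. NONE is left out; it would need its own condition in place of `lastOnlyConditionMet`.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model for illustration; none of these types are Oozie classes.
class ReadyActionDispatchSketch {

    enum Execution { FIFO, LIFO, LAST_ONLY }

    enum Decision { START, SKIP }

    /** Minimal stand-in for a coordinator action that is in the READY state. */
    static final class ReadyAction {
        final String id;
        final Instant nominalTime;

        ReadyAction(String id, Instant nominalTime) {
            this.id = id;
            this.nominalTime = nominalTime;
        }
    }

    /**
     * Assumed LAST_ONLY condition: the next action's nominal time has already passed.
     * A fixed frequency is a simplification of how the next nominal time is computed.
     */
    static boolean lastOnlyConditionMet(ReadyAction action, Duration frequency, Instant now) {
        Instant nextNominalTime = action.nominalTime.plus(frequency);
        return !nextNominalTime.isAfter(now);
    }

    /** Decides whether a READY action should be started or skipped. */
    static Decision decide(Execution execution, ReadyAction action,
                           Duration frequency, Instant now) {
        if (execution == Execution.LAST_ONLY && lastOnlyConditionMet(action, frequency, now)) {
            return Decision.SKIP;  // the real fix would queue a skip command here
        }
        return Decision.START;     // otherwise the action is started as before
    }

    /** Returns the ids of the READY actions that would be skipped. */
    static List<String> idsToSkip(Execution execution, List<ReadyAction> ready,
                                  Duration frequency, Instant now) {
        List<String> skipped = new ArrayList<>();
        for (ReadyAction action : ready) {
            if (decide(execution, action, frequency, now) == Decision.SKIP) {
                skipped.add(action.id);
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2015-01-01T03:30:00Z");
        Duration hourly = Duration.ofHours(1);
        List<ReadyAction> ready = Arrays.asList(
                new ReadyAction("action-1", Instant.parse("2015-01-01T00:00:00Z")),
                new ReadyAction("action-2", Instant.parse("2015-01-01T01:00:00Z")),
                new ReadyAction("action-3", Instant.parse("2015-01-01T02:00:00Z")),
                new ReadyAction("action-4", Instant.parse("2015-01-01T03:00:00Z")));

        System.out.println("LAST_ONLY skips: "
                + idsToSkip(Execution.LAST_ONLY, ready, hourly, now));
        System.out.println("LIFO skips: "
                + idsToSkip(Execution.LIFO, ready, hourly, now));
    }
}
```

With the sample data in `main`, only `action-4` would be started under LAST_ONLY, because the next nominal time of each of the three older actions is already in the past. Under LIFO nothing is skipped, which matches the behavior described above where ordering is the only difference from FIFO. In Oozie itself, the SKIP branch would correspond to queuing a CoordActionSkipXCommand and the START branch to queuing a CoordActionStartXCommand.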

      1. OOZIE-2397.003.patch
        70 kB
        Robert Kanter
      2. OOZIE-2397.002.patch
        67 kB
        Robert Kanter
      3. OOZIE-2397.001.patch
        43 kB
        Robert Kanter

          Activity

          rkanter Robert Kanter added a comment -

          Closing issue; Oozie 4.3.0 is released.

          rkanter Robert Kanter added a comment -

          Thanks for the reviews Puru.

          Committed to trunk!

          hadoopqa Hadoop QA added a comment -

          Testing JIRA OOZIE-2397

          Cleaning local git workspace

          ----------------------------

          +1 PATCH_APPLIES
          +1 CLEAN
          -1 RAW_PATCH_ANALYSIS
          . +1 the patch does not introduce any @author tags
          . +1 the patch does not introduce any tabs
          . +1 the patch does not introduce any trailing spaces
          . -1 the patch contains 8 line(s) longer than 132 characters
          . +1 the patch does adds/modifies 8 testcase(s)
          +1 RAT
          . +1 the patch does not seem to introduce new RAT warnings
          +1 JAVADOC
          . +1 the patch does not seem to introduce new Javadoc warnings
          +1 COMPILE
          . +1 HEAD compiles
          . +1 patch compiles
          . +1 the patch does not seem to introduce new javac warnings
          +1 BACKWARDS_COMPATIBILITY
          . +1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
          . +1 the patch does not modify JPA files
          -1 TESTS
          . Tests run: 1702
          . Tests failed: 5
          . Tests errors: 0

          . The patch failed the following testcases:

          . testSamplers(org.apache.oozie.util.TestMetricsInstrumentation)
          . testBundleStatusNotTransitionFromKilled(org.apache.oozie.service.TestStatusTransitService)
          . testBundleId(org.apache.oozie.servlet.TestBulkMonitorWebServiceAPI)
          . testNone(org.apache.oozie.command.coord.TestCoordActionInputCheckXCommandNonUTC)
          . testForNoDuplicates(org.apache.oozie.event.TestEventGeneration)

          +1 DISTRO
          . +1 distro tarball builds with the patch

          ----------------------------
          -1 Overall result, please check the reported -1(s)

          The full output of the test-patch run is available at

          . https://builds.apache.org/job/oozie-trunk-precommit-build/2601/

          puru Purshotam Shah added a comment -

          +1

          rkanter Robert Kanter added a comment -

          The 003 patch fixes end_of_duration based on Puru's comments on RB.

          hadoopqa Hadoop QA added a comment -

          Testing JIRA OOZIE-2397

          Cleaning local git workspace

          ----------------------------

          +1 PATCH_APPLIES
          +1 CLEAN
          -1 RAW_PATCH_ANALYSIS
          . +1 the patch does not introduce any @author tags
          . +1 the patch does not introduce any tabs
          . +1 the patch does not introduce any trailing spaces
          . -1 the patch contains 9 line(s) longer than 132 characters
          . +1 the patch does adds/modifies 8 testcase(s)
          +1 RAT
          . +1 the patch does not seem to introduce new RAT warnings
          +1 JAVADOC
          . +1 the patch does not seem to introduce new Javadoc warnings
          +1 COMPILE
          . +1 HEAD compiles
          . +1 patch compiles
          . +1 the patch does not seem to introduce new javac warnings
          +1 BACKWARDS_COMPATIBILITY
          . +1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
          . +1 the patch does not modify JPA files
          -1 TESTS
          . Tests run: 1702
          . Tests failed: 3
          . Tests errors: 0

          . The patch failed the following testcases:

          . testSamplers(org.apache.oozie.util.TestMetricsInstrumentation)
          . testForNoDuplicates(org.apache.oozie.event.TestEventGeneration)
          . testCoordChangeConcurrency(org.apache.oozie.command.coord.TestCoordChangeXCommand)

          +1 DISTRO
          . +1 distro tarball builds with the patch

          ----------------------------
          -1 Overall result, please check the reported -1(s)

          The full output of the test-patch run is available at

          . https://builds.apache.org/job/oozie-trunk-precommit-build/2595/

          rkanter Robert Kanter added a comment -

          The 002 patch changes how the next action nominal time is computed as suggested by Purshotam Shah on RB. I also updated and cleaned up affected unit tests.

          test-patch is still broken, so I ran all tests starting with TestCoord* locally and they all passed.

          rkanter Robert Kanter added a comment -

          test-patch test run is unfortunately still broken. The lines that are too long are due to SQL changes or docs changes.

          puru Purshotam Shah added a comment -

          Thanks, Robert, for putting up the patch. I will review it tomorrow.

          hadoopqa Hadoop QA added a comment -

          Testing JIRA OOZIE-2397

          Cleaning local git workspace

          ----------------------------

          +1 PATCH_APPLIES
          +1 CLEAN
          -1 RAW_PATCH_ANALYSIS
          . +1 the patch does not introduce any @author tags
          . +1 the patch does not introduce any tabs
          . +1 the patch does not introduce any trailing spaces
          . -1 the patch contains 5 line(s) longer than 132 characters
          . +1 the patch does adds/modifies 7 testcase(s)
          +1 RAT
          . +1 the patch does not seem to introduce new RAT warnings
          +1 JAVADOC
          . +1 the patch does not seem to introduce new Javadoc warnings
          +1 COMPILE
          . +1 HEAD compiles
          . +1 patch compiles
          . +1 the patch does not seem to introduce new javac warnings
          +1 BACKWARDS_COMPATIBILITY
          . +1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
          . +1 the patch does not modify JPA files
          -1 TESTS - patch does not compile, cannot run testcases
          +1 DISTRO
          . +1 distro tarball builds with the patch

          ----------------------------
          -1 Overall result, please check the reported -1(s)

          The full output of the test-patch run is available at

          . https://builds.apache.org/job/oozie-trunk-precommit-build/2587/

          rkanter Robert Kanter added a comment -

          Purshotam Shah, can you take a look at this?

          rkanter Robert Kanter added a comment -

          RB here: https://reviews.apache.org/r/40157/

          rkanter Robert Kanter added a comment -

          The patch fixes the issue by doing what I said in the description. The patch has a number of changes, but it's mostly moving existing code, copying existing code, etc; there's very little "new" code in it. I also updated the docs to be more clear.

          rkanter Robert Kanter added a comment -

          Oh, I didn't realize that. I'll close OOZIE-2274 as a duplicate of this one because this one has more details.

          I'm working on a fix for that and just have the unit tests left to do. I should have it up by the end of the week.

          puru Purshotam Shah added a comment -

          We noticed the same in our prod; we had a JIRA for this: https://issues.apache.org/jira/browse/OOZIE-2274.


            People

            • Assignee: rkanter Robert Kanter
            • Reporter: rkanter Robert Kanter
            • Votes: 0
            • Watchers: 2