Oozie / OOZIE-2345

Parallel job submission for forked actions

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.3.0
    • Component/s: None
    • Labels: None

      Description

      We have a few customers whose SLA is 8 min. Their workflows have around 30 actions, 25 of which are in a fork.
      Although forked action jobs run in parallel on the cluster, forked action job submission is sequential.
      Whenever the NameNode (NN) is slow, job submission takes longer. Even if each job submission is delayed by only 30 sec, the total WF delay will be ~12 min (25 sequential submissions at 30 sec each).
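
The delay estimate above can be checked with a small calculation. This is only an illustrative sketch: the class and method names are made up for this example, and the parallel figure assumes an idealized case in which every submission starts at the same time.

```java
public class ForkSubmissionDelay {

    // Total submission time when forked actions are submitted one after another.
    static long sequentialSeconds(int forkedActions, long perSubmissionSeconds) {
        return forkedActions * perSubmissionSeconds;
    }

    // Idealized total when all submissions start together (assumes enough threads
    // are available, so the fork waits only for one submission's worth of delay).
    static long parallelSeconds(int forkedActions, long perSubmissionSeconds) {
        return forkedActions > 0 ? perSubmissionSeconds : 0;
    }

    public static void main(String[] args) {
        long seq = sequentialSeconds(25, 30); // 25 forked actions, 30 sec each
        long par = parallelSeconds(25, 30);
        System.out.println("sequential: " + seq + " s (" + seq / 60.0 + " min)");
        System.out.println("parallel:   " + par + " s");
    }
}
```

With 25 forked actions and a 30 sec delay per submission, the sequential total is 750 sec (12.5 min), which already exceeds the 8 min SLA on its own.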

      1. OOZIE-2345-V3.patch
        41 kB
        Purshotam Shah
      2. OOZIE-2345-V4.patch
        69 kB
        Purshotam Shah
      3. OOZIE-2345-V6.patch
        65 kB
        Purshotam Shah
      4. OOZIE-2345-V7.patch
        65 kB
        Purshotam Shah
      5. OOZIE-2345-V8.patch
        67 kB
        Purshotam Shah

          Activity

          puru Purshotam Shah added a comment -

          The approach is to add a new command, ForkedActionStartXCommand, which acquires a lock on the WF action. It only submits the job and does not do any WF update. All WF updates are done by SignalXCommand.
          SignalXCommand submits all ForkedActionStartXCommands and waits for them to complete.
          If any ForkedActionStartXCommand fails, SignalXCommand fails the workflow.
          In case of a transient error or a user retry, ActionStartXCommand is queued, which needs to acquire the WF lock.

          This feature can be enabled or disabled using oozie.workflow.parallel.fork.action.start
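
The submit-all-then-wait pattern described above can be sketched in plain Java. This is not the code from the patch: the class name, the method name, and the use of an ExecutorService are assumptions made for illustration, and each Callable is a stand-in for one forked action's job submission.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelForkStartSketch {

    /**
     * Submits every forked start in parallel, waits for all of them,
     * and returns true only if every one succeeded. The caller (the
     * stand-in for the signal step) would fail the workflow on false.
     */
    static boolean startFork(List<Callable<Boolean>> forkedStarts, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // invokeAll blocks until every task has completed.
            List<Future<Boolean>> results = pool.invokeAll(forkedStarts);
            boolean allSucceeded = true;
            for (Future<Boolean> result : results) {
                try {
                    if (!result.get()) {
                        allSucceeded = false;
                    }
                } catch (ExecutionException e) {
                    // A submission that threw counts as a failed start.
                    allSucceeded = false;
                }
            }
            return allSucceeded;
        } finally {
            pool.shutdown();
        }
    }
}
```

The point of the design is that the individual starts only submit jobs, while the single caller that waits on them is the only place where the overall outcome is decided.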

          rohini Rohini Palaniswamy added a comment -

          oozie.workflow.parallel.fork.action.start - Can we make this true by default in oozie-default.xml ?

          Change the description from "If true, oozie will submit concurrent jobs for all actions in fork" to "Determines how Oozie processes the starting of forked actions. If true, forked actions and their job submissions are done in parallel, which is best for performance. If false, they are submitted sequentially, which is the older approach."

          hadoopqa Hadoop QA added a comment -

          Testing JIRA OOZIE-2345

          Cleaning local git workspace

          ----------------------------

          +1 PATCH_APPLIES
          +1 CLEAN
          +1 RAW_PATCH_ANALYSIS
          . +1 the patch does not introduce any @author tags
          . +1 the patch does not introduce any tabs
          . +1 the patch does not introduce any trailing spaces
          . +1 the patch does not introduce any line longer than 132
          . +1 the patch does adds/modifies 3 testcase(s)
          +1 RAT
          . +1 the patch does not seem to introduce new RAT warnings
          +1 JAVADOC
          . +1 the patch does not seem to introduce new Javadoc warnings
          +1 COMPILE
          . +1 HEAD compiles
          . +1 patch compiles
          . +1 the patch does not seem to introduce new javac warnings
          +1 BACKWARDS_COMPATIBILITY
          . +1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
          . +1 the patch does not modify JPA files
          -1 TESTS
          . Tests run: 1693
          . Tests failed: 2
          . Tests errors: 0

          . The patch failed the following testcases:

          . testAdminInstrumentation(org.apache.oozie.client.TestOozieCLI)
          . testForNoDuplicates(org.apache.oozie.event.TestEventGeneration)

          +1 DISTRO
          . +1 distro tarball builds with the patch

          ----------------------------
          -1 Overall result, please check the reported -1(s)

          The full output of the test-patch run is available at

          . https://builds.apache.org/job/oozie-trunk-precommit-build/2530/

          puru Purshotam Shah added a comment -

          oozie.workflow.parallel.fork.action.start - Can we make this true by default in oozie-default.xml ?

          I think we should keep it false by default, and anyone who runs into this issue can turn it on.
          One caveat of this approach: if the queue size is small and there are a lot of actions in the fork,
          then SignalXCommand will be stuck until all commands complete (this includes waiting for the queue to free up).
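
The caveat above can be illustrated with a bounded queue. This sketch does not use Oozie's queue implementation: the class name, the single worker thread, and the ArrayBlockingQueue are assumptions chosen only to show how a caller that must enqueue many commands ends up waiting for a small queue to drain.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class SmallQueueCaveatSketch {

    /** Enqueues one command per forked action and returns how many completed. */
    static int runFork(int forkedActions, int queueCapacity) throws InterruptedException {
        BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(queueCapacity);
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(forkedActions);

        // A single worker drains the queue, standing in for the command executor.
        Thread worker = new Thread(() -> {
            try {
                for (int i = 0; i < forkedActions; i++) {
                    queue.take().run();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        for (int i = 0; i < forkedActions; i++) {
            // put() blocks while the queue is full, so the caller stalls here
            // whenever the fork has more actions than the queue has free slots.
            queue.put(() -> {
                completed.incrementAndGet();
                done.countDown();
            });
        }

        done.await(); // the caller also waits for every command to finish
        worker.join();
        return completed.get();
    }
}
```

Calling runFork(25, 2) still completes all 25 commands, but the caller spends most of its time blocked, which is the behavior the caveat describes.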

          hadoopqa Hadoop QA added a comment -

          Testing JIRA OOZIE-2345

          Cleaning local git workspace

          ----------------------------

          +1 PATCH_APPLIES
          +1 CLEAN
          +1 RAW_PATCH_ANALYSIS
          . +1 the patch does not introduce any @author tags
          . +1 the patch does not introduce any tabs
          . +1 the patch does not introduce any trailing spaces
          . +1 the patch does not introduce any line longer than 132
          . +1 the patch does adds/modifies 3 testcase(s)
          +1 RAT
          . +1 the patch does not seem to introduce new RAT warnings
          +1 JAVADOC
          . +1 the patch does not seem to introduce new Javadoc warnings
          +1 COMPILE
          . +1 HEAD compiles
          . +1 patch compiles
          . +1 the patch does not seem to introduce new javac warnings
          +1 BACKWARDS_COMPATIBILITY
          . +1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
          . +1 the patch does not modify JPA files
          -1 TESTS - patch does not compile, cannot run testcases
          +1 DISTRO
          . +1 distro tarball builds with the patch

          ----------------------------
          -1 Overall result, please check the reported -1(s)

          The full output of the test-patch run is available at

          . https://builds.apache.org/job/oozie-trunk-precommit-build/2531/

          rohini Rohini Palaniswamy added a comment -

          One caveat of this approach: if the queue size is small and there are a lot of actions in the fork, then SignalXCommand will be stuck until all commands complete (this includes waiting for the queue to free up)

          The default is 10K, and we are running at the biggest scale with just that setting. I don't think anyone would run into any issues. This is a very good performance optimization, and we should have it turned on by default for everyone.

          hadoopqa Hadoop QA added a comment -

          Testing JIRA OOZIE-2345

          Cleaning local git workspace

          ----------------------------

          +1 PATCH_APPLIES
          +1 CLEAN
          +1 RAW_PATCH_ANALYSIS
          . +1 the patch does not introduce any @author tags
          . +1 the patch does not introduce any tabs
          . +1 the patch does not introduce any trailing spaces
          . +1 the patch does not introduce any line longer than 132
          . +1 the patch does adds/modifies 4 testcase(s)
          +1 RAT
          . +1 the patch does not seem to introduce new RAT warnings
          +1 JAVADOC
          . +1 the patch does not seem to introduce new Javadoc warnings
          -1 COMPILE
          . +1 HEAD compiles
          . +1 patch compiles
          . -1 the patch seems to introduce 2 new javac warning(s)
          +1 BACKWARDS_COMPATIBILITY
          . +1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
          . +1 the patch does not modify JPA files
          -1 TESTS
          . Tests run: 1694
          . Tests failed: 2
          . Tests errors: 0

          . The patch failed the following testcases:

          . testForNoDuplicates(org.apache.oozie.event.TestEventGeneration)
          . testActionKillCommandDate(org.apache.oozie.command.coord.TestCoordActionsKillXCommand)

          +1 DISTRO
          . +1 distro tarball builds with the patch

          ----------------------------
          -1 Overall result, please check the reported -1(s)

          The full output of the test-patch run is available at

          . https://builds.apache.org/job/oozie-trunk-precommit-build/2542/

          rohini Rohini Palaniswamy added a comment -

          +1 Pending jenkins

          hadoopqa Hadoop QA added a comment -

          Testing JIRA OOZIE-2345

          Cleaning local git workspace

          ----------------------------

          +1 PATCH_APPLIES
          +1 CLEAN
          +1 RAW_PATCH_ANALYSIS
          . +1 the patch does not introduce any @author tags
          . +1 the patch does not introduce any tabs
          . +1 the patch does not introduce any trailing spaces
          . +1 the patch does not introduce any line longer than 132
          . +1 the patch does adds/modifies 5 testcase(s)
          +1 RAT
          . +1 the patch does not seem to introduce new RAT warnings
          +1 JAVADOC
          . +1 the patch does not seem to introduce new Javadoc warnings
          +1 COMPILE
          . +1 HEAD compiles
          . +1 patch compiles
          . +1 the patch does not seem to introduce new javac warnings
          +1 BACKWARDS_COMPATIBILITY
          . +1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
          . +1 the patch does not modify JPA files
          -1 TESTS
          . Tests run: 1694
          . Tests failed: 2
          . Tests errors: 0

          . The patch failed the following testcases:

          . testAdminInstrumentation(org.apache.oozie.client.TestOozieCLI)
          . testForNoDuplicates(org.apache.oozie.event.TestEventGeneration)

          +1 DISTRO
          . +1 distro tarball builds with the patch

          ----------------------------
          -1 Overall result, please check the reported -1(s)

          The full output of the test-patch run is available at

          . https://builds.apache.org/job/oozie-trunk-precommit-build/2549/

          rohini Rohini Palaniswamy added a comment -

          +1 for OOZIE-2345-V8.patch. Test failures are unrelated

          puru Purshotam Shah added a comment -

          Thanks for the review. Committed to trunk.

          rkanter Robert Kanter added a comment -

          Closing issue; Oozie 4.3.0 is released.


            People

            • Assignee: puru Purshotam Shah
            • Reporter: puru Purshotam Shah