  Oozie / OOZIE-994

ActionCheckXCommand does not handle failures properly


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.3.0
    • Component/s: workflow
    • Labels: None

    Description

      If the JobTracker (JT) restarts or dies and running jobs are lost, or if the JT is not reachable, Oozie's ActionCheckXCommand never fails the workflow job.

      There seem to be 2 issues here:

      • convertException no longer receives the root cause exception; it always receives a HadoopAccessorException wrapping the root cause. We should modify convertException to inspect the cause exception as well.
      • ActionCheckXCommand does not implement the retry-handling logic that ActionStartXCommand has.
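
      The two points above can be illustrated with a minimal, self-contained sketch. It is not the Oozie code or the attached patch: WrapperException stands in for a wrapper such as HadoopAccessorException, and FailureHandlingSketch, ErrorKind, rootCause, classify and nextStep are hypothetical names. The policy shown (connection errors are transient, everything else fails, and retries are capped) is an assumption made for illustration only.

```java
import java.net.ConnectException;

// Hypothetical stand-in for a wrapper exception such as HadoopAccessorException.
class WrapperException extends Exception {
    WrapperException(Throwable cause) {
        super(cause);
    }
}

// Illustrative helper; none of these names come from Oozie.
class FailureHandlingSketch {

    enum ErrorKind { TRANSIENT, NON_TRANSIENT }

    // Walk the cause chain to the innermost exception.
    // The depth limit guards against accidental cycles in the chain.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        int depth = 0;
        while (current.getCause() != null
                && current.getCause() != current
                && depth < 32) {
            current = current.getCause();
            depth++;
        }
        return current;
    }

    // Classify on the root cause rather than on the wrapper.
    // Assumed policy: an unreachable service is a transient condition.
    static ErrorKind classify(Throwable t) {
        Throwable root = rootCause(t);
        return (root instanceof ConnectException)
                ? ErrorKind.TRANSIENT
                : ErrorKind.NON_TRANSIENT;
    }

    // Decide what a status check should do after a failure:
    // retry transient errors up to maxRetries, otherwise fail.
    static String nextStep(Throwable t, int retriesSoFar, int maxRetries) {
        if (classify(t) == ErrorKind.TRANSIENT && retriesSoFar < maxRetries) {
            return "RETRY";
        }
        return "FAIL";
    }
}
```

      With this sketch, a WrapperException around a ConnectException is classified as transient and retried until the cap is reached, after which the check reports a failure instead of looping forever. A wrapper around any other exception fails immediately.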

      Attachments

        1. OOZIE-994.patch
          22 kB
          Robert Kanter
        2. OOZIE-994.patch
          22 kB
          Robert Kanter
        3. OOZIE-994.patch
          22 kB
          Robert Kanter
        4. OOZIE-994.patch
          26 kB
          Robert Kanter
        5. OOZIE-994.patch
          27 kB
          Robert Kanter
        6. OOZIE-994.patch
          28 kB
          Robert Kanter
        7. OOZIE-994.patch
          28 kB
          Robert Kanter

            People

              Assignee: Robert Kanter (rkanter)
              Reporter: Alejandro Abdelnur (tucu00)
              Votes: 0
              Watchers: 6
