OOZIE-1735

Support resuming of failed coordinator job and rerun of a failed coordinator action

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.1.0
    • Component/s: None
    • Labels: None

    Description

      We should support resuming a failed coordinator job. Jobs are set to FAILED when a runtime error occurs (such as a SQL timeout).
      Currently there is no way to recover other than running SQL manually.
      Resuming a failed coordinator job should also set pending to 1, reset doneMaterialization, and set the last modified time to the current time, so that materialization continues.

      We should also provide an option to rerun a failed action. The behavior will be the same as for killed actions.
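
      The state changes requested for resume can be sketched as follows. This is only an illustration, not the code in the attached patches: CoordJob, Status and resumeFailed are hypothetical names, "reset doneMaterialization" is read here as clearing the flag, and moving the job back to RUNNING is an assumption that the description does not state.

```java
import java.time.Instant;

/** Hypothetical, simplified model of a coordinator job record; not Oozie's actual bean. */
public class CoordResumeSketch {

    enum Status { RUNNING, FAILED, KILLED }

    static final class CoordJob {
        Status status;
        int pending;                  // 1 marks the job as having outstanding work
        boolean doneMaterialization;  // true once no further actions will be materialized
        Instant lastModified;

        CoordJob(Status status, int pending, boolean doneMaterialization, Instant lastModified) {
            this.status = status;
            this.pending = pending;
            this.doneMaterialization = doneMaterialization;
            this.lastModified = lastModified;
        }
    }

    /** Applies the field updates listed in the description to a FAILED job. */
    static void resumeFailed(CoordJob job, Instant now) {
        if (job.status != Status.FAILED) {
            throw new IllegalStateException("this sketch only handles FAILED jobs, got " + job.status);
        }
        job.status = Status.RUNNING;      // assumption: the target status is not given in the issue
        job.pending = 1;                  // "set pending to 1"
        job.doneMaterialization = false;  // "reset doneMaterialization" (read as clearing the flag)
        job.lastModified = now;           // "last modified to current time"
    }

    public static void main(String[] args) {
        CoordJob job = new CoordJob(Status.FAILED, 0, true, Instant.EPOCH);
        resumeFailed(job, Instant.now());
        System.out.println(job.status + " pending=" + job.pending
                + " doneMaterialization=" + job.doneMaterialization);
    }
}
```

      Passing the timestamp in as a parameter, rather than reading the clock inside the method, keeps the sketch easy to test.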

      Attachments

        1. OOZIE-1735_v1.patch
          20 kB
          Purshotam Shah
        2. OOZIE-1735-V2.patch
          20 kB
          Purshotam Shah
        3. OOZIE-1735-V2.patch
          20 kB
          Purshotam Shah
        4. OOZIE-1735-V3.patch
          20 kB
          Purshotam Shah

            People

              Assignee: Purshotam Shah (puru)
              Reporter: Purshotam Shah (puru)
              Votes: 0
              Watchers: 4