Apache Gobblin / GOBBLIN-1947

Send WorkUnitChangeEvent when Helix tasks consistently fail

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: gobblin-cluster
    • Labels: None

    Description

      When YarnAutoScalingManager detects that a Helix task is consistently failing, provide an option to send a WorkUnitChangeEvent so that GobblinHelixJobLauncher can handle the event and split the work unit at runtime. This can help resolve consistently failing containers (for example, due to OOM) at runtime instead of relying on the replanner to restart the whole pipeline.
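
      The proposed flow can be sketched roughly as follows. This is only an illustration under stated assumptions, not Gobblin code: FailureTracker, splitInTwo, and the failure threshold are hypothetical stand-ins for the detection in YarnAutoScalingManager, the split handling in GobblinHelixJobLauncher, and a yet-to-be-defined config option. The real WorkUnitChangeEvent is replaced here by a plain callback carrying the failing task id.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class ConsistentFailureSketch {

  /** Hypothetical detector: counts consecutive failures per task id. */
  static final class FailureTracker {
    private final int threshold;                 // assumed to come from a config option
    private final Consumer<String> changeSink;   // stand-in for posting a change event
    private final Map<String, Integer> consecutiveFailures = new HashMap<>();

    FailureTracker(int threshold, Consumer<String> changeSink) {
      this.threshold = threshold;
      this.changeSink = changeSink;
    }

    /** A successful attempt breaks the failure streak. */
    void recordSuccess(String taskId) {
      consecutiveFailures.remove(taskId);
    }

    /** Emits the task id once the streak reaches the threshold, then resets it. */
    void recordFailure(String taskId) {
      int count = consecutiveFailures.merge(taskId, 1, Integer::sum);
      if (count >= threshold) {
        consecutiveFailures.remove(taskId);
        changeSink.accept(taskId);
      }
    }
  }

  /** Hypothetical launcher-side handling: replace one work unit with two smaller ones. */
  static List<String> splitInTwo(String workUnitId) {
    List<String> parts = new ArrayList<>();
    parts.add(workUnitId + "-part-0");
    parts.add(workUnitId + "-part-1");
    return parts;
  }

  public static void main(String[] args) {
    List<String> replacements = new ArrayList<>();
    // The sink plays the launcher's role: it receives the failing id and splits it.
    FailureTracker tracker =
        new FailureTracker(3, taskId -> replacements.addAll(splitInTwo(taskId)));

    tracker.recordFailure("task-1");
    tracker.recordFailure("task-1");
    tracker.recordFailure("task-1"); // third consecutive failure triggers the split
    System.out.println(replacements);
  }
}
```

      Counting consecutive rather than total failures keeps a task that recovers on retry from being split unnecessarily; how the streak is actually defined is left open by this issue.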

          People

            hutran Hung Tran
            hanghangliu Hanghang Liu
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated:

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 1h 40m