Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Slider 0.80
    • Fix Version/s: Slider 1.0.0
    • Component/s: appmaster
    • Labels:
      None

      Description

      Phase 3 of placement enhancements

        Issue Links

          Activity

          stevel@apache.org Steve Loughran added a comment -

          SLIDER-109 - Liveness detection should be part of this. If a service isn't responding, the failure tracking should detect and react to it.

          Possibly add a specific exit from the agent (with special text in the diagnostics?) so that this can be identified locally, with the AM treating a liveness failure as possibly related to both the component and the node.
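
One way to picture the failure tracking described above is the minimal sketch below. It is not Slider code: the `FailureTracker` class, the `LIVENESS_MARKER` text, and the `record_exit` signature are all hypothetical names chosen for illustration. It assumes the agent signals a liveness failure by putting a known marker string in the exit diagnostics, and it counts such a failure against both the component and the node.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical marker text the agent would place in its exit diagnostics.
LIVENESS_MARKER = "liveness-check-failed"


@dataclass
class FailureTracker:
    """Counts liveness failures per node and per component (illustrative only)."""
    node_failures: Counter = field(default_factory=Counter)
    component_failures: Counter = field(default_factory=Counter)

    def record_exit(self, node: str, component: str, diagnostics: str) -> bool:
        """Return True if the exit was classified as a liveness failure."""
        is_liveness = LIVENESS_MARKER in diagnostics
        if is_liveness:
            # A liveness failure may be caused by the component or by the
            # node, so it is recorded against both.
            self.node_failures[node] += 1
            self.component_failures[component] += 1
        # Other exits are left to whatever handling already exists; that
        # handling is not modeled in this sketch.
        return is_liveness
```

Recording against both counters keeps the question of blame open: later placement decisions could consult either count.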

          stevel@apache.org Steve Loughran added a comment -

          With the SLIDER-799 placement management code, we should be able to implement anti-affinity ourselves:

          1. Get a topology map (at least a list of nodes) from YARN.
          2. Pick some at random.
          3. Ask for them.
          4. Escalate by not just relaxing placement, but by asking for other nodes remaining in the set of not-yet-requested locations.
          5. Fall back to relaxed placement.

          This would give parallel placement requests (= fast launch on a quiet cluster) and a strategy for dealing with failure to place.
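
The five steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not Slider or YARN code: `place_anti_affine`, `request_node`, and `request_relaxed` are hypothetical names. The two callbacks stand in for whatever mechanism actually issues container requests, and the requests here are made one after another rather than in parallel.

```python
import random
from typing import Callable, Iterable, List, Optional


def place_anti_affine(
    nodes: Iterable[str],
    wanted: int,
    request_node: Callable[[str], bool],
    request_relaxed: Callable[[], Optional[str]],
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Try to place `wanted` instances on distinct nodes, escalating on failure.

    nodes           -- node list taken from the topology map (step 1)
    request_node    -- hypothetical callback: True if that node granted a container
    request_relaxed -- hypothetical callback: relaxed request, returns a node or None
    """
    rng = rng or random.Random()
    # Step 2: de-duplicate, then randomize the order in which nodes are tried.
    not_yet_requested = list(dict.fromkeys(nodes))
    rng.shuffle(not_yet_requested)
    placed: List[str] = []

    # Steps 3 and 4: ask for a batch of nodes; for every request that fails,
    # escalate by drawing replacements from the not-yet-requested set.
    while len(placed) < wanted and not_yet_requested:
        batch_size = wanted - len(placed)
        batch = not_yet_requested[:batch_size]
        not_yet_requested = not_yet_requested[batch_size:]
        for node in batch:
            if request_node(node):
                placed.append(node)

    # Step 5: fall back to relaxed placement for anything still unplaced.
    # A relaxed request may land on a node that is already in use.
    while len(placed) < wanted:
        node = request_relaxed()
        if node is None:
            break
        placed.append(node)
    return placed
```

Because each node is requested at most once before the relaxed fallback, no two strict placements share a node. Only the final fallback can break anti-affinity.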


            People

            • Assignee:
              Unassigned
              Reporter:
              stevel@apache.org Steve Loughran
            • Votes:
              0
              Watchers:
              4

              Dates

              • Created:
                Updated:

                Time Tracking

                Estimated:
                38h
                Remaining:
                38h
                Logged:
                Not Specified
