Slider / SLIDER-758

Slider placement requests to skip unreliable nodes

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: Slider 0.60
    • Fix Version/s: Slider 0.70
    • Component/s: appmaster
    • Labels: None
    • Sprint: Slider Jan #2

    Description

      As discussed on the developer list, Slider's "prefer previously used nodes" policy is biased towards recently used nodes, even when those nodes are failing to launch containers.

      As we already track node failure rates, the placement logic can be enhanced so that it does not generate "placed" requests for nodes with a recent failure history for that component type.

      The initial iteration of this feature will not use the YARN blacklisting APIs; instead it will build up history in the AM, and that history will be lost on AM restart. Because YARN itself is not told to avoid these nodes, even unplaced requests may end up being scheduled on the unreliable nodes.

      This strategy (which we could revisit in future), combined with a regular reset of the failure counters, stops Slider from blacklisting nodes whose failure rate was high some time ago but which are now reliable again.
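
The placement behaviour described above could be sketched roughly as follows. This is a minimal illustration in Java, not Slider's actual code: the class `NodeFailureHistory`, its method names, and the failure threshold are all hypothetical, chosen only for this example. It keeps per-node, per-component failure counts in memory (so they would be lost on AM restart), skips unreliable nodes when picking a host for a "placed" request, and supports a periodic reset of the counters.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/**
 * Hypothetical in-memory failure history. Names and threshold are
 * assumptions for illustration; this is not Slider's real API.
 */
class NodeFailureHistory {
    // Failures at or above this count mark a node as unreliable (assumed policy).
    private final int failureThreshold;

    // "host/component" -> recent failure count. Held only in AM memory,
    // so the history disappears when the AM restarts.
    private final Map<String, Integer> failures = new HashMap<>();

    NodeFailureHistory(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    private static String key(String host, String component) {
        return host + "/" + component;
    }

    /** Record one failed container launch of a component on a host. */
    void recordFailure(String host, String component) {
        failures.merge(key(host, component), 1, Integer::sum);
    }

    /** A node is reliable for a component while its count is below the threshold. */
    boolean isReliable(String host, String component) {
        return failures.getOrDefault(key(host, component), 0) < failureThreshold;
    }

    /**
     * Pick the first previously used host that is still reliable for this
     * component. An empty result means: issue an unplaced request instead.
     */
    Optional<String> choosePlacement(List<String> previouslyUsed, String component) {
        for (String host : previouslyUsed) {
            if (isReliable(host, component)) {
                return Optional.of(host);
            }
        }
        return Optional.empty();
    }

    /** Called on a regular schedule so that recovered nodes become eligible again. */
    void resetFailureCounters() {
        failures.clear();
    }
}
```

In this sketch an empty result from `choosePlacement` only means that no placed request is generated. Since nothing is passed to YARN's blacklisting APIs, the scheduler would remain free to allocate the resulting unplaced request on one of the skipped nodes.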

      Testing: primarily via mocking

            People

              Assignee: Steve Loughran (stevel@apache.org)
              Reporter: Steve Loughran (stevel@apache.org)
              Votes: 0
              Watchers: 1

                Time Tracking

                  Estimated: 2h
                  Remaining: 2h
                  Logged: Not Specified