
Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: appmaster
    • Labels: None
    • Sprint: Slider August #1, Slider August #2, Slider September #1

    Description

      Use sliding windows and/or weighted moving averages to track container failures over time, and only react if many are failing in a short period.
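A minimal sketch of the sliding-window idea, for illustration only. The class name, method names, window length, and threshold are all hypothetical and are not taken from the Slider codebase; timestamps are assumed to be seconds supplied by the caller in non-decreasing order.

```python
from collections import deque


class SlidingWindowFailureTracker:
    """Counts failures seen within the last `window_seconds` seconds.

    Hypothetical helper for illustration; not part of Slider.
    """

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._timestamps = deque()  # failure times, oldest first

    def _evict(self, now):
        # Drop failures that have fallen out of the window ending at `now`.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()

    def record_failure(self, now):
        # Assumes `now` never goes backwards between calls.
        self._timestamps.append(now)
        self._evict(now)

    def failures_in_window(self, now):
        self._evict(now)
        return len(self._timestamps)

    def should_react(self, now):
        # React only when enough failures cluster inside the window.
        return self.failures_in_window(now) >= self.threshold
```

With a 300-second window and a threshold of 5, five failures one minute apart trigger a reaction, while the same five failures are forgotten once the window has moved past them.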

      The goal here is to react fast to a sudden series of failures, as well as to look at average failure rates over time. I think separating startup failures from operational failures could help here. We don't want 5 failures in 5 minutes to be ignored just because everything worked well for the previous month.
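One way to keep a long quiet history from masking a recent burst is an exponentially decayed failure count, kept separately per failure category. The sketch below is an assumption about how that could look, not the approach taken in the actual fix: the class names, the half-life, the category names ("startup", "operational"), and the thresholds are all hypothetical.

```python
import math


class DecayedFailureScore:
    """Failure count whose weight halves every `half_life_seconds`."""

    def __init__(self, half_life_seconds):
        self.decay_rate = math.log(2) / half_life_seconds
        self.score = 0.0
        self.last_update = None

    def _decay_to(self, now):
        if self.last_update is None:
            self.last_update = now
            return
        elapsed = now - self.last_update
        if elapsed > 0:  # ignore out-of-order timestamps
            self.score *= math.exp(-self.decay_rate * elapsed)
            self.last_update = now

    def record_failure(self, now):
        self._decay_to(now)
        self.score += 1.0

    def value(self, now):
        self._decay_to(now)
        return self.score


class CategorizedFailureMonitor:
    """Tracks one decayed score per category, each with its own threshold."""

    def __init__(self, half_life_seconds, thresholds):
        self.thresholds = dict(thresholds)
        self.scores = {
            category: DecayedFailureScore(half_life_seconds)
            for category in self.thresholds
        }

    def record_failure(self, category, now):
        self.scores[category].record_failure(now)

    def categories_over_threshold(self, now):
        # Categories whose recent, decayed failure count calls for a reaction.
        return sorted(
            category
            for category, score in self.scores.items()
            if score.value(now) >= self.thresholds[category]
        )
```

Because old failures decay towards zero and clean history adds nothing to the score, a month of healthy operation neither hides nor dilutes a burst of five failures in five minutes. Separate thresholds let startup failures be judged more or less strictly than operational ones.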

            People

              stevel@apache.org Steve Loughran
              Votes: 0
              Watchers: 2

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 1h
                  Logged: 5h