Sling / SLING-11181

Emit metrics that distinguish transient and permanent distribution failures

Details

    Description

      Context

      Currently, our error metrics don't distinguish between distribution failures that are permanent (they fail even when retried) and failures that are transient (they succeed after one or more retries).
      We want the metrics to tell these two scenarios apart.

      Solution

      The failure metric should be labeled with one of two failure types:

      • Transient failure
      • Permanent failure

      Proposed approach

      We can distinguish the two scenarios using the following rule:

      • A transient failure occurs whenever a package is eventually distributed successfully but needed more than one attempt, i.e. retries > 0
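
      The rule above can be sketched in plain Java. This is only an illustration, not the Sling implementation: the class FailureMetrics, the enum FailureType, and the methods classify, record, and count are hypothetical names, and the in-memory counters stand in for whatever metrics registry is actually used. The ticket only defines the transient case, so the permanent branch assumes that a package given up on without ever succeeding counts as a permanent failure.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper for illustration; not part of the Sling distribution API.
final class FailureMetrics {

    // Values of the proposed failure label.
    enum FailureType { TRANSIENT, PERMANENT }

    // One counter per label value; a stand-in for a real metrics registry.
    private final Map<FailureType, LongAdder> counters = new ConcurrentHashMap<>();

    /**
     * Classifies the final outcome of one package.
     *
     * @param succeeded whether the package was eventually distributed
     * @param retries   number of attempts made after the first one
     * @return the failure type, or empty if the first attempt succeeded
     */
    static Optional<FailureType> classify(boolean succeeded, int retries) {
        if (succeeded) {
            // Rule from the ticket: success with retries > 0 is a transient failure.
            return retries > 0 ? Optional.of(FailureType.TRANSIENT) : Optional.empty();
        }
        // Assumption: a package abandoned without success is a permanent failure.
        return Optional.of(FailureType.PERMANENT);
    }

    // Increments the counter that matches the outcome, if it was a failure at all.
    void record(boolean succeeded, int retries) {
        classify(succeeded, retries)
            .ifPresent(type -> counters.computeIfAbsent(type, t -> new LongAdder()).increment());
    }

    // Returns how many outcomes were recorded under the given label value.
    long count(FailureType type) {
        LongAdder adder = counters.get(type);
        return adder == null ? 0L : adder.sum();
    }
}
```

      With this shape, a package that succeeds on the first attempt increments neither counter, so the two label values only ever describe packages that failed at least once.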

       

            People

              jose-correia José Correia