Apache S4 / S4-44

optional backoff upon multiple consecutive failed checkpoint fetches

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.4, 0.5.0
    • Fix Version/s: 0.5.0
    • None

    Description

      If a checkpointing backend becomes unresponsive (e.g. a stalled NFS mount) while a series of recoveries is under way (for instance, at startup or during failover), each checkpoint fetch will block until it hits a timeout or another kind of exception, and the system will then continue without recovering that PE.

      We should provide a way to detect this pattern (multiple backend fetch failures in a short amount of time) and temporarily disable fetching from the backend, in order to reduce blocking when the backend becomes unresponsive.
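
The detection described above could be sketched roughly as follows. This is only an illustration of the idea, not the change that was committed for this issue: the class name FetchBackoff, its methods, and the three thresholds (failure count, time window, pause duration) are all hypothetical and are not part of the S4 API. The clock is injected so the behaviour can be exercised without real waiting.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not part of S4): tracks recent consecutive fetch
 * failures and disables fetching for a while once too many pile up.
 */
final class FetchBackoff {
    private final int maxFailures;     // failures tolerated within the window
    private final long windowMillis;   // the "short amount of time" to look back over
    private final long pauseMillis;    // how long fetching stays disabled
    private final LongSupplier clock;  // time source in milliseconds
    private final Deque<Long> failureTimes = new ArrayDeque<>();
    private long disabledUntil = Long.MIN_VALUE;

    FetchBackoff(int maxFailures, long windowMillis, long pauseMillis, LongSupplier clock) {
        this.maxFailures = maxFailures;
        this.windowMillis = windowMillis;
        this.pauseMillis = pauseMillis;
        this.clock = clock;
    }

    /** Returns true if a fetch from the backend should be attempted now. */
    synchronized boolean shouldFetch() {
        return clock.getAsLong() >= disabledUntil;
    }

    /** Call after a fetch timed out or threw an exception. */
    synchronized void recordFailure() {
        long now = clock.getAsLong();
        failureTimes.addLast(now);
        // Forget failures that are older than the observation window.
        while (!failureTimes.isEmpty() && now - failureTimes.peekFirst() > windowMillis) {
            failureTimes.removeFirst();
        }
        if (failureTimes.size() >= maxFailures) {
            // Too many failures in a short time: stop fetching for a while.
            disabledUntil = now + pauseMillis;
            failureTimes.clear();
        }
    }

    /** Call after a successful fetch; breaks the run of consecutive failures. */
    synchronized void recordSuccess() {
        failureTimes.clear();
    }
}
```

A recovery path would check shouldFetch() before contacting the backend and, while it returns false, skip the fetch and start the PE without its checkpoint, which is what already happens after a timeout, only without the blocking. Once the pause elapses, the next fetch acts as a probe of whether the backend has come back.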

            People

              Assignee: Matthieu Morel (mmorel)
              Reporter: Matthieu Morel (mmorel)
              Votes: 0
              Watchers: 1
