Uploaded image for project: 'Apache Ozone'
  1. Apache Ozone
  2. HDDS-4388

Make writeStateMachineTimeout retry count proportional to node failure timeout

    XMLWordPrintableJSON

Details

    Description

      Currently, in ratis "writeStateMachinecall" gets retried indefinitely in event of a timeout. In case, where disks are slow/overloaded or number of chunk writer threads are not available for a period of 10s, writeStateMachine call times out in 10s. In cases like these, the same write chunk keeps on getting retried causing the same chunk of data to be overwritten. The idea here is to abort the request once the node failure timeout reaches.

      Attachments

        Issue Links

          Activity

            People

              shashikant Shashikant Banerjee
              shashikant Shashikant Banerjee
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: