  1. Ratis
  2. RATIS-726

TimeoutScheduler holds on to the raftClientRequest till it times out even though request succeeds

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.5.0
    • Component/s: client
    • Labels: None

    Description

      While running freon with a 1-node Ratis setup, it was observed that the TimeoutScheduler holds on to the raftClientRequest object for at least 3s (the default requestTimeoutDuration) even though the request has been processed successfully and acknowledged. This creates memory pressure, causing the Ozone client to go OOM.

      From the heap dump analysis in HDDS-2331, the TimeoutScheduler appears to be holding on to a total of 176 requests (88 writeChunk requests containing actual data and 88 putBlock requests), although data is written sequentially, key by key, in Ozone.

      Thanks adoroszlai for helping to discover this.

      cc ~ ljain msingh szetszwo jnpandey

      A similar fix may be required in GrpcLogAppender as well, since it uses the same TimeoutScheduler.
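
      The retention pattern described above can be reproduced with plain JDK classes. The following is a minimal sketch, not the change in the attached patch and not the Ratis TimeoutScheduler API: the class TimeoutSketch, its methods, and the releaseOnCompletion flag are hypothetical names used only for illustration. It assumes the timeout task captures the request, so the request stays reachable from the scheduler queue until the delay elapses unless the task is cancelled and removed when the reply arrives.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for a timeout scheduler; NOT the Ratis implementation. */
class TimeoutSketch {
  private final ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);
  private final boolean releaseOnCompletion;

  TimeoutSketch(boolean releaseOnCompletion) {
    this.releaseOnCompletion = releaseOnCompletion;
    // With the default policy (false), a cancelled task stays in the queue
    // until its delay elapses, so everything it captured stays reachable.
    scheduler.setRemoveOnCancelPolicy(releaseOnCompletion);
  }

  /** Registers a timeout for the request and returns the future that the reply completes. */
  CompletableFuture<String> send(byte[] request, long timeoutMs) {
    CompletableFuture<String> reply = new CompletableFuture<>();
    // The lambda captures the request payload, which pins it in the scheduler queue.
    ScheduledFuture<?> timeout = scheduler.schedule(
        () -> reply.completeExceptionally(
            new TimeoutException("request of " + request.length + " bytes timed out")),
        timeoutMs, TimeUnit.MILLISECONDS);
    if (releaseOnCompletion) {
      // Assumed fix: cancel the timeout as soon as the reply completes,
      // so the queued task (and the request it captured) is dropped.
      reply.whenComplete((result, error) -> timeout.cancel(false));
    }
    return reply;
  }

  /** Number of timeout tasks still held in the scheduler queue. */
  int pendingTimeouts() {
    return scheduler.getQueue().size();
  }

  void shutdown() {
    scheduler.shutdownNow();
  }
}
```

      With releaseOnCompletion set to false, completing 88 requests immediately still leaves 88 timeout tasks queued for the full timeout; with it set to true, the queue is empty once the replies have completed.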

      Attachments

        1. r726_20191022.patch
          5 kB
          Tsz-wo Sze

            People

              Assignee: Tsz-wo Sze (szetszwo)
              Reporter: Shashikant Banerjee (shashikant)