Ratis / RATIS-485

TimeoutScheduler is leaked by gRPC client implementation


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.0
    • Component/s: examples
    • Labels: None

    Description

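      Running the filestore load generator without a reachable Ratis cluster (e.g. with spurious peer addresses) results in an OutOfMemoryError ("unable to create new native thread").

      If one specifies a single Ratis server, the client retries seemingly indefinitely: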

      vagrant@ratis-server:~/incubator-ratis$ ./ratis-examples/src/main/bin/client.sh filestore loadgen --size 1048576 --numFiles 100 --peers n0:127.0.0.1:1

      If one specifies two Ratis servers, the client fails with an OutOfMemoryError:

      vagrant@ratis-server:~/incubator-ratis$ ./ratis-examples/src/main/bin/client.sh filestore loadgen --size 1048576 --numFiles 100 --peers n0:127.0.0.1:1,n1:127.0.0.1:2
      [...]
      1/787867107@5e5792a0 with java.util.concurrent.CompletionException: java.io.IOException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
      2019-02-14 07:47:22 DEBUG RaftClient:417 - client-272A2E13A5DD: suggested new leader: null. Failed RaftClientRequest:client-272A2E13A5DD->n1@group-6F7570313233, cid=0, seq=0 RW, org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0 with java.io.IOException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
      2019-02-14 07:47:22 DEBUG RaftClient:437 - client-272A2E13A5DD: change Leader from n1 to n0
      2019-02-14 07:47:22 DEBUG RaftClient:291 - schedule attempt #10740 with policy RetryForeverNoSleep for RaftClientRequest:client-272A2E13A5DD->n1@group-6F7570313233, cid=0, seq=0 RW, org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0
      2019-02-14 07:47:22 DEBUG RaftClient:323 - client-272A2E13A5DD: send* RaftClientRequest:client-272A2E13A5DD->n0@group-6F7570313233, cid=0, seq=0 RW, org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0
      2019-02-14 07:47:22 DEBUG RaftClient:338 - client-272A2E13A5DD: Failed RaftClientRequest:client-272A2E13A5DD->n0@group-6F7570313233, cid=0, seq=0 RW, org.apache.ratis.examples.filestore.FileStoreClient$$Lambda$41/787867107@5e5792a0 with java.util.concurrent.CompletionException: java.lang.OutOfMemoryError: unable to create new native thread
      Exception in thread "main" java.util.concurrent.CompletionException: java.lang.OutOfMemoryError: unable to create new native thread
              at org.apache.ratis.client.impl.RaftClientImpl.lambda$sendRequestAsync$14(RaftClientImpl.java:349)
              at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
              at java.util.concurrent.CompletableFuture.uniExceptionallyStage(CompletableFuture.java:884)
              at java.util.concurrent.CompletableFuture.exceptionally(CompletableFuture.java:2196)
              at org.apache.ratis.client.impl.RaftClientImpl.sendRequestAsync(RaftClientImpl.java:334)
              at org.apache.ratis.client.impl.RaftClientImpl.sendRequestWithRetryAsync(RaftClientImpl.java:286)
              at org.apache.ratis.util.SlidingWindow$Client.sendOrDelayRequest(SlidingWindow.java:243)
              at org.apache.ratis.util.SlidingWindow$Client.retry(SlidingWindow.java:259)
              at org.apache.ratis.client.impl.RaftClientImpl.lambda$null$10(RaftClientImpl.java:293)
              at org.apache.ratis.util.TimeoutScheduler.lambda$onTimeout$0(TimeoutScheduler.java:85)
              at org.apache.ratis.util.TimeoutScheduler.lambda$onTimeout$1(TimeoutScheduler.java:104)
              at org.apache.ratis.util.LogUtils.runAndLog(LogUtils.java:50)
              at org.apache.ratis.util.LogUtils$1.run(LogUtils.java:91)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
      Caused by: java.lang.OutOfMemoryError: unable to create new native thread
              at java.lang.Thread.start0(Native Method)
              at java.lang.Thread.start(Thread.java:717)
              at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
              at java.util.concurrent.ThreadPoolExecutor.ensurePrestart(ThreadPoolExecutor.java:1603)
              at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:334)
              at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
              at org.apache.ratis.util.TimeoutScheduler.schedule(TimeoutScheduler.java:117)
              at org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:104)
              at org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:82)
              at org.apache.ratis.util.TimeoutScheduler.onTimeout(TimeoutScheduler.java:134)
              at org.apache.ratis.grpc.client.GrpcClientProtocolClient$AsyncStreamObservers.onNext(GrpcClientProtocolClient.java:234)
              at org.apache.ratis.grpc.client.GrpcClientRpc.sendRequestAsync(GrpcClientRpc.java:71)
              at org.apache.ratis.client.impl.RaftClientImpl.sendRequestAsync(RaftClientImpl.java:324)
              ... 15 more
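
The "Caused by" frames above show the OutOfMemoryError being raised while ScheduledThreadPoolExecutor.schedule starts a new worker thread (ensurePrestart -> addWorker -> Thread.start). A minimal sketch of how that can happen when a fresh scheduler is created for each attempt and never released, compared with sharing one scheduler, is given below. It is an illustration only: the class and method names are hypothetical, and it is neither the Ratis TimeoutScheduler code nor the fix in the attached patches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not part of Ratis.
public class SchedulerLeakSketch {

  /**
   * Schedules one no-op timeout task per simulated request and returns the
   * number of worker threads the scheduler(s) had to create.
   */
  public static int countThreads(int requests, boolean shareScheduler) {
    AtomicInteger created = new AtomicInteger();
    // Count every thread a scheduler asks for.
    ThreadFactory factory = r -> {
      created.incrementAndGet();
      Thread t = new Thread(r);
      t.setDaemon(true);
      return t;
    };

    List<ScheduledExecutorService> schedulers = new ArrayList<>();
    CountDownLatch done = new CountDownLatch(requests);
    Runnable timeoutTask = done::countDown;

    ScheduledExecutorService shared = null;
    for (int i = 0; i < requests; i++) {
      ScheduledExecutorService scheduler;
      if (shareScheduler) {
        // One scheduler reused by every request.
        if (shared == null) {
          shared = Executors.newScheduledThreadPool(1, factory);
          schedulers.add(shared);
        }
        scheduler = shared;
      } else {
        // A new scheduler per request: each one starts its own worker thread.
        scheduler = Executors.newScheduledThreadPool(1, factory);
        schedulers.add(scheduler);
      }
      scheduler.schedule(timeoutTask, 1, TimeUnit.MILLISECONDS);
    }

    try {
      done.await(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    // The demo cleans up here; in the leaking pattern this shutdown never happens,
    // so the idle core threads stay alive.
    schedulers.forEach(ScheduledExecutorService::shutdownNow);
    return created.get();
  }

  public static void main(String[] args) {
    System.out.println("per-request schedulers: " + countThreads(100, false) + " threads");
    System.out.println("shared scheduler:       " + countThreads(100, true) + " threads");
  }
}
```

With per-request schedulers the thread count grows with the number of attempts, so a retry loop with no sleep (the log above is already at attempt #10740) can plausibly reach the OS thread limit. With a shared scheduler the count stays at one.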
      

      Attachments

        1. RATIS-485.004.patch
          16 kB
          Tsz-wo Sze
        2. RATIS-485.003.patch
          17 kB
          Josh Elser
        3. r485_20190828.patch
          13 kB
          Tsz-wo Sze
        4. r485_20190827.patch
          1 kB
          Tsz-wo Sze
        5. loadgen.log
          29.37 MB
          Clay B.

            People

              Assignee: Tsz-wo Sze (szetszwo)
              Reporter: Clay B. (clayb)