Tajo (Retired) / TAJO-1399

TajoResourceAllocator might hang on network error

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 0.11.0
    • Component/s: RPC
    • Labels: None

    Description

      CallFuture<WorkerResourceAllocationResponse> callBack = new CallFuture<WorkerResourceAllocationResponse>();
      
      ...
      
      RpcConnectionPool connPool = RpcConnectionPool.getPool();
      NettyClientBase tmClient = null;
      try {
        ServiceTracker serviceTracker = queryTaskContext.getQueryMasterContext().getWorkerContext().getServiceTracker();
        tmClient = connPool.getConnection(serviceTracker.getUmbilicalAddress(), QueryCoordinatorProtocol.class, true);
        QueryCoordinatorProtocolService masterClientService = tmClient.getStub();
        masterClientService.allocateWorkerResources(null, request, callBack);
      } catch (Throwable e) {
        LOG.error(e.getMessage(), e);
      } finally {
        connPool.releaseConnection(tmClient);
      }
      
      WorkerResourceAllocationResponse response = null;
      while(!stopped.get()) {
        try {
          response = callBack.get(3, TimeUnit.SECONDS);
          ...
      

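      If "callBack" is never registered with Netty (for example, because the connection attempt fails), the exception is only logged and the allocator thread still goes on to wait on the future. Nothing will ever complete that future, so the thread keeps blocking on it until "stopped" is set, which can leak the thread.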
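
      One possible way to avoid the hang is sketched below. It is not the Tajo fix (this issue was resolved as a duplicate), only a minimal illustration under stated assumptions: java.util.concurrent.CompletableFuture stands in for Tajo's CallFuture, and AllocationCall and allocate are hypothetical names that replace the real RPC stub call. The idea is to fail the future when the send throws, so the waiting loop exits instead of polling a future nobody will complete.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

public class AllocationWaitSketch {

  /** Hypothetical stand-in for the RPC stub call; it may throw before the callback is registered. */
  public interface AllocationCall {
    void send(CompletableFuture<String> callBack) throws Exception;
  }

  /**
   * Sends the request and waits for the response.
   * Returns null if the request could not be sent or the allocator was stopped.
   */
  public static String allocate(AllocationCall call, AtomicBoolean stopped, long pollMillis)
      throws InterruptedException {
    CompletableFuture<String> callBack = new CompletableFuture<>();
    try {
      call.send(callBack);
    } catch (Exception e) {
      // The callback never reached the transport, so nothing will ever
      // complete it. Fail it here instead of only logging the error.
      callBack.completeExceptionally(e);
    }

    while (!stopped.get()) {
      try {
        return callBack.get(pollMillis, TimeUnit.MILLISECONDS);
      } catch (TimeoutException e) {
        // Still pending: loop again so that the stopped flag is re-checked.
      } catch (ExecutionException e) {
        // The send failed or the call completed with an error: give up.
        return null;
      }
    }
    return null;
  }

  public static void main(String[] args) throws Exception {
    AtomicBoolean stopped = new AtomicBoolean(false);

    // Simulated connection failure: allocate returns promptly instead of blocking.
    String failed = allocate(cb -> { throw new IOException("connection refused"); }, stopped, 100);
    System.out.println("failed send -> " + failed);

    // Simulated success: the transport completes the callback.
    String ok = allocate(cb -> cb.complete("allocated"), stopped, 100);
    System.out.println("successful send -> " + ok);
  }
}
```

      In this sketch a failed send and a stopped allocator both return null; a real caller would probably want to distinguish the two cases, or retry the send, rather than treat them alike.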

            People

              Assignee: Navis Ryu (navis)
              Reporter: Navis Ryu (navis)
              Votes: 0
              Watchers: 3
