Tajo / TAJO-274

Maintaining connectivity to Tajo master regardless of the restart of the Tajo master

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels:
      None

      Description

      Currently, when the Tajo master is restarted, all workers and clients must be restarted as well.

      When a client or worker loses its connection to the Tajo master because the master was restarted, it should close the previous connection and try to reconnect to the master.
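The behavior requested above (drop the stale connection, then reconnect and retry) could be sketched roughly as follows. This is only an illustration and not the code from the patch: `ReconnectingCaller`, `Connection`, `Connector`, and the retry parameters are all hypothetical names chosen for the example.

```java
import java.io.IOException;

/** Illustrative only: these types are hypothetical stand-ins, not Tajo classes. */
final class ReconnectingCaller {

  /** Minimal stand-in for an open RPC connection. */
  interface Connection {
    String call(String request) throws IOException;
    void close();
  }

  /** Opens a fresh connection to the master. */
  interface Connector {
    Connection connect() throws IOException;
  }

  private final Connector connector;
  private final int maxRetries;
  private final long retryDelayMillis;
  private Connection current; // null until first use or after a failure

  ReconnectingCaller(Connector connector, int maxRetries, long retryDelayMillis) {
    this.connector = connector;
    this.maxRetries = maxRetries;
    this.retryDelayMillis = retryDelayMillis;
  }

  /** Sends a request, closing the stale connection and reconnecting on failure. */
  synchronized String call(String request) {
    IOException lastFailure = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        if (current == null) {
          current = connector.connect();
        }
        return current.call(request);
      } catch (IOException e) {
        lastFailure = e;
        // The master may have restarted: discard the old connection.
        if (current != null) {
          current.close();
          current = null;
        }
      }
      if (attempt < maxRetries) {
        try {
          Thread.sleep(retryDelayMillis);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          throw new IllegalStateException("Interrupted while waiting to reconnect", ie);
        }
      }
    }
    throw new IllegalStateException(
        "Gave up after " + (maxRetries + 1) + " attempts", lastFailure);
  }
}
```

With a bounded retry count the caller eventually sees a failure instead of blocking forever, which is one possible design choice; a worker heartbeat might reasonably prefer to retry indefinitely.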

      1. TAJO-274.patch_2
        103 kB
        Keuntae Park
      2. TAJO-274.patch
        92 kB
        Keuntae Park

        Activity

        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-trunk-postcommit #540 (See https://builds.apache.org/job/Tajo-trunk-postcommit/540/)
        TAJO-274: Maintaining connectivity to Tajo master regardless of the restart of the Tajo master. (Keuntae Park via hyunsik) (hyunsik: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=2f09450442bdbdda2a34f9ad4ec66643344a17fd)

        • tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java
        • tajo-catalog/tajo-catalog-client/src/main/java/org/apache/tajo/catalog/CatalogClient.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMasterTask.java
        • tajo-rpc/src/test/java/org/apache/tajo/rpc/TestAsyncRpc.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/client/TajoClient.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/worker/TaskRunner.java
        • CHANGES.txt
        • tajo-rpc/src/main/java/org/apache/tajo/rpc/RetriesExhaustedException.java
        • tajo-rpc/src/main/java/org/apache/tajo/rpc/RpcConnectionPool.java
        • tajo-rpc/src/main/proto/TestProtocol.proto
        • tajo-rpc/src/main/java/org/apache/tajo/rpc/CallFuture.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMaster.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/cli/TajoCli.java
        • tajo-rpc/src/main/java/org/apache/tajo/rpc/BlockingRpcClient.java
        • tajo-catalog/tajo-catalog-server/src/main/java/org/apache/tajo/catalog/CatalogServer.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/worker/TajoWorker.java
        • tajo-rpc/src/main/java/org/apache/tajo/rpc/NettyClientBase.java
        • tajo-catalog/tajo-catalog-server/src/main/java/org/apache/tajo/catalog/LocalCatalogWrapper.java
        • tajo-catalog/tajo-catalog-client/src/main/java/org/apache/tajo/catalog/AbstractCatalogClient.java
        • tajo-rpc/src/main/java/org/apache/tajo/rpc/AsyncRpcClient.java
        • tajo-rpc/src/main/java/org/apache/tajo/rpc/DefaultRpcController.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/worker/TajoResourceAllocator.java
        • tajo-rpc/src/test/java/org/apache/tajo/rpc/TestBlockingRpc.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/TajoContainerProxy.java
        • tajo-rpc/src/main/java/org/apache/tajo/rpc/ServerCallable.java
        • tajo-rpc/src/test/java/org/apache/tajo/rpc/test/impl/DummyProtocolAsyncImpl.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/worker/Task.java
        • tajo-rpc/src/test/java/org/apache/tajo/rpc/test/impl/DummyProtocolBlockingImpl.java
        hyunsik Hyunsik Choi added a comment -

        +1

        Great work. Thank you for your contribution. I've removed some unused imports and done some trivial refactoring. I've then committed the patch and marked the issue as resolved.

        hyunsik Hyunsik Choi added a comment -

        I'll take a look at this patch tonight.

        sirpkt Keuntae Park added a comment -

        Thank you for the review, Jihoon.
        (Sorry for the late reply)

        I've uploaded the new patch for the issue.

        • The connectToTajoMaster() method is no longer needed because of the new reconnection logic, so I removed it.
        • Set a default controller and added exception handling in the async RPC.
        • Rebased.
        jihoonson Jihoon Son added a comment - - edited

        Thanks for your contribution,
        but this patch includes some strange code like this.

        In TajoWorker
        private void connectToTajoMaster(String tajoMasterAddrString) {
          LOG.info("Connecting to TajoMaster (" + tajoMasterAddrString +")");
          this.tajoMasterAddress = NetUtils.createSocketAddr(tajoMasterAddrString);
        
          while(true) {
            try {
        //        tajoMasterRpc = new AsyncRpcClient(TajoMasterProtocol.class, this.tajoMasterAddress);
        //        tajoMasterRpcClient = tajoMasterRpc.getStub();
              break;
            } catch (Exception e) {
              LOG.error("Can't connect to TajoMaster[" + NetUtils.normalizeInetSocketAddress(tajoMasterAddress) + "], "
                  + e.getMessage(), e);
            }
             try {
              Thread.sleep(3000);
            } catch (InterruptedException e) {
            }
          }
        }
        
        sirpkt Keuntae Park added a comment -

        I've uploaded the patch for the issue

        • It adds singleton-based connection management (RpcConnectionPool).
          This is not a traditional connection pool; it manages a connection object, because one NettyClientBase object can already handle multiple calls concurrently and it is rare for one server to have multiple connections at the same time.
        • Connection retry is added to TajoClient, the heartbeat request from the worker to the TajoMaster, and CatalogClient.
        This patch also resolves the following issues: TAJO-153 and TAJO-138.
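The idea of keeping one shared connection object per server, rather than a pool of many, might look something like the sketch below. It is not the actual RpcConnectionPool from the patch: `SharedConnections`, its methods, and the factory function are hypothetical, and the host name and port in the usage comment are arbitrary.

```java
import java.net.InetSocketAddress;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Illustrative only: a registry holding one shared connection per server address. */
final class SharedConnections<C> {

  private final ConcurrentMap<InetSocketAddress, C> connections = new ConcurrentHashMap<>();
  private final Function<InetSocketAddress, C> factory;

  SharedConnections(Function<InetSocketAddress, C> factory) {
    this.factory = factory;
  }

  /** Returns the shared connection for addr, creating it on first use. */
  C get(InetSocketAddress addr) {
    return connections.computeIfAbsent(addr, factory);
  }

  /**
   * Forgets a broken connection so that the next get() opens a fresh one.
   * Nothing happens if another caller has already replaced it.
   */
  void invalidate(InetSocketAddress addr, C broken) {
    connections.remove(addr, broken);
  }
}

// Possible usage (hypothetical names):
//   SharedConnections<MyClient> pool = new SharedConnections<>(addr -> new MyClient(addr));
//   MyClient client = pool.get(InetSocketAddress.createUnresolved("master.example", 12345));
```

A process could keep a single instance of such a registry in a static field, which is one way to read "singleton-based"; callers that hit a connection error would call `invalidate` and then `get` again to reconnect.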

          People

          • Assignee: Keuntae Park (sirpkt)
          • Reporter: Keuntae Park (sirpkt)
          • Votes: 0
          • Watchers: 3
