Uploaded image for project: 'Tajo'
  1. Tajo
  2. TAJO-1221

HA TajoClient should not connect TajoMaster at the first.

    Details

      Description

      Problem

      We TajoClient is opened, TajoClient initially tries to connect TajoMaster. This manner does not guarantee high availability. A known TajoMaster host may be not work anymore. So, this manner still has some failure point.

      Also, this manner prohibits a Tajo cluster to run on some dynamic cluster environments like Yarn.

      Solution

      Tajo HA client should get directly TajoMaster addresses and others from HA component without contacting TajoMaster.

        Activity

        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/282#issuecomment-65736135

        I'll create new pull request.

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner commented on the pull request: https://github.com/apache/tajo/pull/282#issuecomment-65736135 I'll create new pull request.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner closed the pull request at:

        https://github.com/apache/tajo/pull/282

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner closed the pull request at: https://github.com/apache/tajo/pull/282
        Hide
        githubbot ASF GitHub Bot added a comment -

        GitHub user blrunner opened a pull request:

        https://github.com/apache/tajo/pull/286

        TAJO-1221: HA TajoClient should not connect TajoMaster at the first.

        I added a method to update TableStats at CatalogStore. For reference, all unit cases for physical operators verify output row numbers and it finished successfully with my patch. So, I didn't implement unit test cases. In addition, HCatalogStore::updateTableStats is dummy method. Because hive don't provide no api to update table row numbers.

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/blrunner/tajo TAJO-1221

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/tajo/pull/286.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #286


        commit 9dc6cedce24a999c8f01fa701d4bc74e18642e28
        Author: JaeHwa Jung <blrunner@apache.org>
        Date: 2014-12-04T03:01:19Z

        TAJO-1221: HA TajoClient should not connect TajoMaster at the first.


        Show
        githubbot ASF GitHub Bot added a comment - GitHub user blrunner opened a pull request: https://github.com/apache/tajo/pull/286 TAJO-1221 : HA TajoClient should not connect TajoMaster at the first. I added a method to update TableStats at CatalogStore. For reference, all unit cases for physical operators verify output row numbers and it finished successfully with my patch. So, I didn't implement unit test cases. In addition, HCatalogStore::updateTableStats is dummy method. Because hive don't provide no api to update table row numbers. You can merge this pull request into a Git repository by running: $ git pull https://github.com/blrunner/tajo TAJO-1221 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tajo/pull/286.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #286 commit 9dc6cedce24a999c8f01fa701d4bc74e18642e28 Author: JaeHwa Jung <blrunner@apache.org> Date: 2014-12-04T03:01:19Z TAJO-1221 : HA TajoClient should not connect TajoMaster at the first.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/286#discussion_r21439028

        — Diff: tajo-common/src/main/java/org/apache/tajo/util/NetUtils.java —
        @@ -30,6 +35,16 @@ public static InetSocketAddress createSocketAddr(String addr)

        { return new InetSocketAddress(splitted[0], Integer.parseInt(splitted[1])); }

        + public static InetSocketAddress createLocalSocketAddr(InetSocketAddress addr) {
        — End diff –

        Could you explain its purpose?

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on a diff in the pull request: https://github.com/apache/tajo/pull/286#discussion_r21439028 — Diff: tajo-common/src/main/java/org/apache/tajo/util/NetUtils.java — @@ -30,6 +35,16 @@ public static InetSocketAddress createSocketAddr(String addr) { return new InetSocketAddress(splitted[0], Integer.parseInt(splitted[1])); } + public static InetSocketAddress createLocalSocketAddr(InetSocketAddress addr) { — End diff – Could you explain its purpose?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/286#discussion_r21439057

        — Diff: tajo-core/src/main/java/org/apache/tajo/master/ha/HAServiceHDFSImpl.java —
        @@ -169,6 +173,48 @@ private void createMasterFile(boolean isActive) throws IOException {
        }

        + private String getHostName(String hostAddress, int type) {
        + String hostName = null;
        + int port = 0;
        +
        + switch (type) {
        + case 1:
        — End diff –

        I think that the direct use of number makes code readability bad. I'd like to suggest using some constants or Enum type.

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on a diff in the pull request: https://github.com/apache/tajo/pull/286#discussion_r21439057 — Diff: tajo-core/src/main/java/org/apache/tajo/master/ha/HAServiceHDFSImpl.java — @@ -169,6 +173,48 @@ private void createMasterFile(boolean isActive) throws IOException { } + private String getHostName(String hostAddress, int type) { + String hostName = null; + int port = 0; + + switch (type) { + case 1: — End diff – I think that the direct use of number makes code readability bad. I'd like to suggest using some constants or Enum type.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/286#discussion_r21440019

        — Diff: tajo-common/src/main/java/org/apache/tajo/util/NetUtils.java —
        @@ -30,6 +35,16 @@ public static InetSocketAddress createSocketAddr(String addr)

        { return new InetSocketAddress(splitted[0], Integer.parseInt(splitted[1])); }

        + public static InetSocketAddress createLocalSocketAddr(InetSocketAddress addr) {
        — End diff –

        Main components (ex: CatalogServer, TajoResourceTracker) create an InetSocketAddress with NetUtils::createSocketAddr and TajoConf. But if the address set to "0.0.0.0" at TajoConf, NetUtils::createSocketAddr bind socket address to "localhost" unconditionally. It does matter on Yarn with TajoMaster Ha. If socket address bind to "localohost" in Netty, TajoWorker never can't connect the component. So, I set local ip address with NetUtils::createLocalSocketAddr. If you have any good idea, suggest to me easefully.

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner commented on a diff in the pull request: https://github.com/apache/tajo/pull/286#discussion_r21440019 — Diff: tajo-common/src/main/java/org/apache/tajo/util/NetUtils.java — @@ -30,6 +35,16 @@ public static InetSocketAddress createSocketAddr(String addr) { return new InetSocketAddress(splitted[0], Integer.parseInt(splitted[1])); } + public static InetSocketAddress createLocalSocketAddr(InetSocketAddress addr) { — End diff – Main components (ex: CatalogServer, TajoResourceTracker) create an InetSocketAddress with NetUtils::createSocketAddr and TajoConf. But if the address set to "0.0.0.0" at TajoConf, NetUtils::createSocketAddr bind socket address to "localhost" unconditionally. It does matter on Yarn with TajoMaster Ha. If socket address bind to "localohost" in Netty, TajoWorker never can't connect the component. So, I set local ip address with NetUtils::createLocalSocketAddr. If you have any good idea, suggest to me easefully.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/286#issuecomment-66037887

        Thanks @hyunsik.

        I've just added a constant class for HA. And I leaved some comments for NetUtils::createLocalSocketAddr.

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner commented on the pull request: https://github.com/apache/tajo/pull/286#issuecomment-66037887 Thanks @hyunsik. I've just added a constant class for HA. And I leaved some comments for NetUtils::createLocalSocketAddr.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/286#issuecomment-66219990

        You seem to merge your patch with wrong revisions. Could you check the submitted patch?

        Show
        githubbot ASF GitHub Bot added a comment - Github user hyunsik commented on the pull request: https://github.com/apache/tajo/pull/286#issuecomment-66219990 You seem to merge your patch with wrong revisions. Could you check the submitted patch?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/286#issuecomment-66223292

        Thanks @hyunsik

        I merged the master branch to TAJO-1221 again.
        Could you check it again?

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner commented on the pull request: https://github.com/apache/tajo/pull/286#issuecomment-66223292 Thanks @hyunsik I merged the master branch to TAJO-1221 again. Could you check it again?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/286#issuecomment-66734250

        Hi @hyunsik.

        I removed NetUtils::createLocalSocketAddr. And we need to guide for users to use rpc server address '0.0.0.0' instead of 'localhost' on yarn cluster.

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner commented on the pull request: https://github.com/apache/tajo/pull/286#issuecomment-66734250 Hi @hyunsik. I removed NetUtils::createLocalSocketAddr. And we need to guide for users to use rpc server address '0.0.0.0' instead of 'localhost' on yarn cluster.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jinossy commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/286#discussion_r21731837

        — Diff: tajo-core/src/main/java/org/apache/tajo/master/ha/HAServiceHDFSImpl.java —
        @@ -169,6 +174,48 @@ private void createMasterFile(boolean isActive) throws IOException {
        }

        + private String getHostName(String hostAddress, int type) {
        + String hostName = null;
        + int port = 0;
        +
        + switch (type) {
        + case HAConstants.MASTER_UMBILICAL_RPC_ADDRESS:
        + hostName = context.getConf().get(TajoConf.ConfVars.TAJO_MASTER_UMBILICAL_RPC_ADDRESS
        + .varname);
        + port = 26001;
        — End diff –

        The port is hard coded. any have reason ?

        Show
        githubbot ASF GitHub Bot added a comment - Github user jinossy commented on a diff in the pull request: https://github.com/apache/tajo/pull/286#discussion_r21731837 — Diff: tajo-core/src/main/java/org/apache/tajo/master/ha/HAServiceHDFSImpl.java — @@ -169,6 +174,48 @@ private void createMasterFile(boolean isActive) throws IOException { } + private String getHostName(String hostAddress, int type) { + String hostName = null; + int port = 0; + + switch (type) { + case HAConstants.MASTER_UMBILICAL_RPC_ADDRESS: + hostName = context.getConf().get(TajoConf.ConfVars.TAJO_MASTER_UMBILICAL_RPC_ADDRESS + .varname); + port = 26001; — End diff – The port is hard coded. any have reason ?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/286#issuecomment-66958836

        Thanks @jinossy ,
        I removed unnecessary codes. Could you check it again?

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner commented on the pull request: https://github.com/apache/tajo/pull/286#issuecomment-66958836 Thanks @jinossy , I removed unnecessary codes. Could you check it again?
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user jinossy commented on the pull request:

        https://github.com/apache/tajo/pull/286#issuecomment-67298944

        +1
        Looks good to me.

        Show
        githubbot ASF GitHub Bot added a comment - Github user jinossy commented on the pull request: https://github.com/apache/tajo/pull/286#issuecomment-67298944 +1 Looks good to me.
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user asfgit closed the pull request at:

        https://github.com/apache/tajo/pull/286

        Show
        githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/tajo/pull/286
        Hide
        githubbot ASF GitHub Bot added a comment -

        Github user blrunner commented on the pull request:

        https://github.com/apache/tajo/pull/286#issuecomment-67433929

        Thanks @jinossy.
        I've just committed it to the master branch.

        Show
        githubbot ASF GitHub Bot added a comment - Github user blrunner commented on the pull request: https://github.com/apache/tajo/pull/286#issuecomment-67433929 Thanks @jinossy. I've just committed it to the master branch.
        Hide
        blrunner Jaehwa Jung added a comment -

        I've just committed it to the master branch.

        Show
        blrunner Jaehwa Jung added a comment - I've just committed it to the master branch.
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Tajo-master-CODEGEN-build #149 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/149/)
        TAJO-1221: HA TajoClient should not connect TajoMaster at the first. (jaehwa) (blrunner: rev 2548768cfed8f018727f6f054317b2adb4d7f485)

        • tajo-core/src/main/java/org/apache/tajo/master/TajoContainerProxy.java
        • tajo-core/src/main/java/org/apache/tajo/worker/rule/ConnectivityCheckerRuleForTajoWorker.java
        • tajo-core/src/main/java/org/apache/tajo/worker/WorkerHeartbeatService.java
        • tajo-core/src/main/java/org/apache/tajo/master/querymaster/QueryMasterTask.java
        • tajo-client/src/main/java/org/apache/tajo/client/TajoHAClientUtil.java
        • tajo-core/src/main/java/org/apache/tajo/master/querymaster/QueryMaster.java
        • tajo-client/src/main/java/org/apache/tajo/cli/tools/TajoHAAdmin.java
        • tajo-core/src/main/java/org/apache/tajo/master/querymaster/QueryJobManager.java
        • CHANGES
        • tajo-catalog/tajo-catalog-client/src/main/java/org/apache/tajo/catalog/AbstractCatalogClient.java
        • tajo-common/src/main/java/org/apache/tajo/util/HAServiceUtil.java
        • tajo-core/src/test/java/org/apache/tajo/master/ha/TestHAServiceHDFSImpl.java
        • tajo-client/src/main/java/org/apache/tajo/cli/tools/TajoAdmin.java
        • tajo-client/src/main/java/org/apache/tajo/client/SessionConnection.java
        • tajo-core/src/main/java/org/apache/tajo/worker/TajoWorker.java
        • tajo-client/src/main/java/org/apache/tajo/client/TajoClientImpl.java
        • tajo-common/src/main/java/org/apache/tajo/ha/HAServiceUtil.java
        • tajo-core/src/test/java/org/apache/tajo/TajoTestingCluster.java
        • tajo-core/src/main/java/org/apache/tajo/worker/TajoResourceAllocator.java
        • tajo-common/src/main/java/org/apache/tajo/ha/HAConstants.java
        • tajo-core/src/main/java/org/apache/tajo/master/ha/HAServiceHDFSImpl.java
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Tajo-master-CODEGEN-build #149 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/149/ ) TAJO-1221 : HA TajoClient should not connect TajoMaster at the first. (jaehwa) (blrunner: rev 2548768cfed8f018727f6f054317b2adb4d7f485) tajo-core/src/main/java/org/apache/tajo/master/TajoContainerProxy.java tajo-core/src/main/java/org/apache/tajo/worker/rule/ConnectivityCheckerRuleForTajoWorker.java tajo-core/src/main/java/org/apache/tajo/worker/WorkerHeartbeatService.java tajo-core/src/main/java/org/apache/tajo/master/querymaster/QueryMasterTask.java tajo-client/src/main/java/org/apache/tajo/client/TajoHAClientUtil.java tajo-core/src/main/java/org/apache/tajo/master/querymaster/QueryMaster.java tajo-client/src/main/java/org/apache/tajo/cli/tools/TajoHAAdmin.java tajo-core/src/main/java/org/apache/tajo/master/querymaster/QueryJobManager.java CHANGES tajo-catalog/tajo-catalog-client/src/main/java/org/apache/tajo/catalog/AbstractCatalogClient.java tajo-common/src/main/java/org/apache/tajo/util/HAServiceUtil.java tajo-core/src/test/java/org/apache/tajo/master/ha/TestHAServiceHDFSImpl.java tajo-client/src/main/java/org/apache/tajo/cli/tools/TajoAdmin.java tajo-client/src/main/java/org/apache/tajo/client/SessionConnection.java tajo-core/src/main/java/org/apache/tajo/worker/TajoWorker.java tajo-client/src/main/java/org/apache/tajo/client/TajoClientImpl.java tajo-common/src/main/java/org/apache/tajo/ha/HAServiceUtil.java tajo-core/src/test/java/org/apache/tajo/TajoTestingCluster.java tajo-core/src/main/java/org/apache/tajo/worker/TajoResourceAllocator.java tajo-common/src/main/java/org/apache/tajo/ha/HAConstants.java tajo-core/src/main/java/org/apache/tajo/master/ha/HAServiceHDFSImpl.java
        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-master-build #508 (See https://builds.apache.org/job/Tajo-master-build/508/)
        TAJO-1221: HA TajoClient should not connect TajoMaster at the first. (jaehwa) (blrunner: rev 2548768cfed8f018727f6f054317b2adb4d7f485)

        • tajo-core/src/main/java/org/apache/tajo/master/querymaster/QueryMaster.java
        • tajo-core/src/test/java/org/apache/tajo/TajoTestingCluster.java
        • tajo-common/src/main/java/org/apache/tajo/util/HAServiceUtil.java
        • tajo-core/src/main/java/org/apache/tajo/worker/TajoResourceAllocator.java
        • tajo-catalog/tajo-catalog-client/src/main/java/org/apache/tajo/catalog/AbstractCatalogClient.java
        • tajo-client/src/main/java/org/apache/tajo/cli/tools/TajoHAAdmin.java
        • tajo-core/src/main/java/org/apache/tajo/worker/WorkerHeartbeatService.java
        • tajo-core/src/main/java/org/apache/tajo/worker/TajoWorker.java
        • tajo-client/src/main/java/org/apache/tajo/client/TajoHAClientUtil.java
        • CHANGES
        • tajo-client/src/main/java/org/apache/tajo/cli/tools/TajoAdmin.java
        • tajo-core/src/main/java/org/apache/tajo/master/querymaster/QueryMasterTask.java
        • tajo-core/src/main/java/org/apache/tajo/master/querymaster/QueryJobManager.java
        • tajo-core/src/main/java/org/apache/tajo/master/TajoContainerProxy.java
        • tajo-core/src/main/java/org/apache/tajo/master/ha/HAServiceHDFSImpl.java
        • tajo-core/src/main/java/org/apache/tajo/worker/rule/ConnectivityCheckerRuleForTajoWorker.java
        • tajo-common/src/main/java/org/apache/tajo/ha/HAServiceUtil.java
        • tajo-core/src/test/java/org/apache/tajo/master/ha/TestHAServiceHDFSImpl.java
        • tajo-client/src/main/java/org/apache/tajo/client/TajoClientImpl.java
        • tajo-client/src/main/java/org/apache/tajo/client/SessionConnection.java
        • tajo-common/src/main/java/org/apache/tajo/ha/HAConstants.java
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in Tajo-master-build #508 (See https://builds.apache.org/job/Tajo-master-build/508/ ) TAJO-1221 : HA TajoClient should not connect TajoMaster at the first. (jaehwa) (blrunner: rev 2548768cfed8f018727f6f054317b2adb4d7f485) tajo-core/src/main/java/org/apache/tajo/master/querymaster/QueryMaster.java tajo-core/src/test/java/org/apache/tajo/TajoTestingCluster.java tajo-common/src/main/java/org/apache/tajo/util/HAServiceUtil.java tajo-core/src/main/java/org/apache/tajo/worker/TajoResourceAllocator.java tajo-catalog/tajo-catalog-client/src/main/java/org/apache/tajo/catalog/AbstractCatalogClient.java tajo-client/src/main/java/org/apache/tajo/cli/tools/TajoHAAdmin.java tajo-core/src/main/java/org/apache/tajo/worker/WorkerHeartbeatService.java tajo-core/src/main/java/org/apache/tajo/worker/TajoWorker.java tajo-client/src/main/java/org/apache/tajo/client/TajoHAClientUtil.java CHANGES tajo-client/src/main/java/org/apache/tajo/cli/tools/TajoAdmin.java tajo-core/src/main/java/org/apache/tajo/master/querymaster/QueryMasterTask.java tajo-core/src/main/java/org/apache/tajo/master/querymaster/QueryJobManager.java tajo-core/src/main/java/org/apache/tajo/master/TajoContainerProxy.java tajo-core/src/main/java/org/apache/tajo/master/ha/HAServiceHDFSImpl.java tajo-core/src/main/java/org/apache/tajo/worker/rule/ConnectivityCheckerRuleForTajoWorker.java tajo-common/src/main/java/org/apache/tajo/ha/HAServiceUtil.java tajo-core/src/test/java/org/apache/tajo/master/ha/TestHAServiceHDFSImpl.java tajo-client/src/main/java/org/apache/tajo/client/TajoClientImpl.java tajo-client/src/main/java/org/apache/tajo/client/SessionConnection.java tajo-common/src/main/java/org/apache/tajo/ha/HAConstants.java

          People

          • Assignee:
            blrunner Jaehwa Jung
            Reporter:
            hyunsik Hyunsik Choi
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development