TAJO-336: Separate catalog stores into separate modules

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.8.0
    • Component/s: Build, Catalog
    • Labels:
      None

      Description

      Catalog stores, such as HCatalogStore, involve dependencies on other systems. These dependencies may conflict with Tajo's own dependencies because the required library versions can differ in many cases.

      To address this problem, separating catalog stores from the catalog-server module would be a good solution.

      1. TAJO-336_2.patch (68 kB, Jaehwa Jung)
      2. TAJO-336_3.patch (67 kB, Jaehwa Jung)
      3. TAJO-336.patch (58 kB, Jaehwa Jung)

          Activity

          blrunner Jaehwa Jung added a comment -

          I entirely agree. Even now, their dependencies cause some problems.
          If no one else wants to resolve it, I would like to resolve this issue.
          What's your opinion?

          jihoonson Jihoon Son added a comment -

          I think that this issue is blocking TAJO-135.
          I assigned this issue to JaeHwa.
          Thanks.

          blrunner Jaehwa Jung added a comment -

          I made a directory named tajo-hive, located in the root directory. Users who want to use HCatalogStore must build in the tajo-hive directory. This patch doesn't affect the main build process.

          hyunsik Hyunsik Choi added a comment -

          Nice job!

          I would also like to offer some suggestions for better packaging.

          • In my view, the name 'tajo-hive' is too general.
          • We should use Maven profiles to enable only the necessary drivers.
            • Tajo can have many catalog store drivers, and the drivers can conflict with one another.
          • We should allow 'mvn package -Ddist' to package the enabled drivers.
            • Otherwise, users would have to handle the dependencies of certain drivers themselves.

          If the above suggestions are a burden for you, I can take this work. Thanks!

          jihoonson Jihoon Son added a comment -

          Great work, JaeHwa!
          I also agree with Hyunsik's suggestions.
          I'll wait for the improved patch.
          Thanks!

          blrunner Jaehwa Jung added a comment -

          Thanks, Jihoon.
          I also agree with Hyunsik's opinions.
          I'll upload the improved patch ASAP.

          blrunner Jaehwa Jung added a comment -

          Thanks guys!

          I moved '/tajo-hive' to '/tajo-catalog/tajo-catalog-drivers/tajo-hcatalog'. If catalog store modules were located in the root directory, they would create many directories there and the layout would look cluttered. And if we want to add more catalog stores, we just need to add a catalog store module under 'tajo-catalog-drivers'.

          I also separated the catalog drivers module from the main module. If we execute 'mvn clean install' or 'mvn package -Pdist', Tajo doesn't build the hcatalog module. To build the hcatalog module, we must specify the hcatalog version and the hadoop version as follows:

          mvn clean package -Pdist -Phcatalog-0.12.0 -Phadoop20 (hive-0.12.0 and hadoop-1.x.x)
          mvn clean package -Pdist -Phcatalog-0.12.0 -Phadoop26 (hive-0.12.0 and hadoop-2.0.6-alpha)
          

          Finally, if users want to use HCatalogStore, they must set HIVE_HOME in tajo-env.sh. If HIVE_HOME is set, the tajo shell script adds the hcatalog dependencies to the classpath.
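
The HIVE_HOME behaviour described above can be sketched roughly as follows. This is an illustration only, not the code in tajo-dist/src/main/bin/tajo: the function name append_hcatalog_classpath is hypothetical, and the jar locations ($HIVE_HOME/lib and $HIVE_HOME/hcatalog/share/hcatalog) are assumptions about a typical Hive layout that may not match a given installation.

```shell
#!/bin/sh
# Illustrative sketch only: NOT the actual logic of tajo-dist/src/main/bin/tajo.
# Assumption: Hive jars live under $HIVE_HOME/lib and HCatalog jars under
# $HIVE_HOME/hcatalog/share/hcatalog; adjust for your installation.

# append_hcatalog_classpath CLASSPATH [HIVE_HOME]   (hypothetical helper)
# Prints CLASSPATH with every jar found in the assumed Hive directories appended.
# The body runs in a subshell so helper variables do not leak to the caller.
append_hcatalog_classpath() (
  classpath=$1
  hive_home=${2:-}
  # Without a usable HIVE_HOME the classpath is printed unchanged.
  if [ -n "$hive_home" ] && [ -d "$hive_home" ]; then
    for dir in "$hive_home/lib" "$hive_home/hcatalog/share/hcatalog"; do
      [ -d "$dir" ] || continue
      for jar in "$dir"/*.jar; do
        [ -e "$jar" ] || continue   # the glob matched nothing
        classpath="${classpath:+$classpath:}$jar"
      done
    done
  fi
  printf '%s\n' "$classpath"
)
```

A launcher script could then call it as CLASSPATH=$(append_hcatalog_classpath "$CLASSPATH" "${HIVE_HOME:-}"), so that a build without Hive support starts with an unchanged classpath.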

          P.S. I'll write a wiki document this week that describes how to integrate with Hive.

          jihoonson Jihoon Son added a comment -

          Nice work!
          I'll review the patch after a while.

          hyunsik Hyunsik Choi added a comment -

          +1

          Nice work. Overall, the patch looks great to me.

          In addition, I would like to give some suggestions. The tajo-hcatalog pom file contains several ambiguous hadoop version profiles, as follows:

          tajo-hcatalog.pom
          <id>hadoop20</id>
          ...
          <id>hadoop25</id>
          ...
          <id>hadoop26</id>
          

          Actually, the profile names do not match the actual hadoop versions. If hadoop-2.0.5's profile id is hadoop25, what will the profile id for hadoop-2.5.0 be? I think that full version names are more readable. In addition, in pom.xml, hadoop20 indicates 1.1.2. It should be fixed.

          <hadoop20.version>1.1.2</hadoop20.version>
          

          Besides, the hadoop version profiles should be placed in tajo-project/pom.xml instead of tajo-hcatalog, and Tajo should follow the specified hadoop version. It would then be better for tajo-hcatalog to follow the hadoop version that Tajo uses.

          jihoonson Jihoon Son added a comment -

          +1.
          This is really nice work.
          As Hyunsik mentioned above, it would be great if the problem of ambiguous profile names were solved.

          blrunner Jaehwa Jung added a comment - - edited

          Thanks guys.

          I also agree with Hyunsik's suggestion.
          I updated the hadoop dependency to follow the hadoop version of 'tajo-project'.

          Please check the patch again.

          hyunsik Hyunsik Choi added a comment -

          +1

          Ship it!

          blrunner Jaehwa Jung added a comment -

          Thanks Hyunsik Choi.
          I've just committed it.

          jihoonson Jihoon Son added a comment -

          Great work!
          I'll start TAJO-135.
          Thanks, JaeHwa.

          hudson Hudson added a comment -

          FAILURE: Integrated in Tajo-trunk-postcommit #610 (See https://builds.apache.org/job/Tajo-trunk-postcommit/610/)
          TAJO-336: Separate catalog stores into separate modules. (jaehwa) (jhjung: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=60f7df201d560e2693c29cd75674f2bccccf9092)

          • tajo-catalog/tajo-catalog-drivers/tajo-hcatalog/src/main/java/org/apache/tajo/catalog/store/HCatalogStore.java
          • tajo-catalog/tajo-catalog-server/pom.xml
          • tajo-catalog/tajo-catalog-server/src/main/java/org/apache/tajo/catalog/store/HCatalogStore.java
          • tajo-catalog/pom.xml
          • tajo-catalog/tajo-catalog-drivers/tajo-hcatalog/src/main/java/org/apache/tajo/catalog/store/HCatalogUtil.java
          • tajo-catalog/tajo-catalog-drivers/tajo-hcatalog/pom.xml
          • tajo-dist/src/main/bin/tajo
          • tajo-dist/src/main/conf/tajo-env.sh
          • tajo-catalog/tajo-catalog-server/src/main/java/org/apache/tajo/catalog/store/HCatalogUtil.java
          • tajo-catalog/tajo-catalog-drivers/pom.xml
          • tajo-dist/pom.xml
          • CHANGES.txt

            People

            • Assignee: blrunner Jaehwa Jung
            • Reporter: hyunsik Hyunsik Choi
            • Votes: 0
            • Watchers: 4
