HCatalog / HCATALOG-520

Simplify HCatalog dependencies

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: build
    • Labels: None

    Description

      Looking through the hcatalog-core dependencies I believe we have an opportunity to trim them down. A major goal of HCatalog is to be a dependency of other processing tools, and we can make that more attractive by invading their classpath as little as possible.

      I believe the following look good (minus hive-exec which is a fat jar, but that's a separate issue):

          <dependency org="org.apache.hadoop" name="hadoop-tools" rev="${hadoop20.version}" conf="default->*"/>
          <dependency org="org.apache.hive" name="hive-builtins" rev="${hive.version}"/>
          <dependency org="org.apache.hive" name="hive-metastore" rev="${hive.version}"/>
          <dependency org="org.apache.hive" name="hive-common" rev="${hive.version}"/>
          <dependency org="org.apache.hive" name="hive-exec" rev="${hive.version}"/>
          <dependency org="org.apache.hive" name="hive-cli" rev="${hive.version}"/>
          <dependency org="org.apache.hive" name="hive-hbase-handler" rev="${hive.version}">
            <exclude org="org.apache.maven.plugins"/>
            <exclude org="org.jruby"/>
          </dependency>
      

      The following are where I believe we can make improvements:

      <dependency org="org.apache.pig" name="pig" rev="${pig.version}" conf="default->*"/>
      

      hcatalog-core still declares Pig as a regular dependency, although only its tests use it. A major goal of switching to subprojects was to stop forcing processing frameworks as dependencies on people using HCat. This dependency should move to the test target, since some core tests use Pig for convenience.
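
      A minimal sketch of that move, assuming the module's ivy.xml defines a `test` configuration. The configuration name and the `test->default` mapping are assumptions, not taken from the actual build:

      ```xml
      <!-- Pig resolved only for the (assumed) test configuration -->
      <dependency org="org.apache.pig" name="pig" rev="${pig.version}" conf="test->default"/>
      ```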

      <dependency org="javax.management.j2ee" name="management-api" rev="${javax-mgmt.version}"/>
      

      Does anyone know why management-api is needed? I'm not familiar with it and a quick grep shows no usages. It's something JMS-related, and may have been needed by hcatalog-server-extensions at some point. If tests pass without it, I think we should remove it.

      <dependency org="org.codehaus.jackson" name="jackson-mapper-asl" rev="${jackson.version}"/>
      <dependency org="org.codehaus.jackson" name="jackson-core-asl" rev="${jackson.version}"/>
      

      The HCatalog build requests jackson 1.7.3, while hive-exec depends on 1.8.8. Any objection to using the versions provided by Hive?

      <dependency org="org.apache.thrift" name="libfb303" rev="${fb303.version}"/>
      

      I don't believe this is required, because hive-metastore already depends on libfb303.

      <dependency org="commons-dbcp" name="commons-dbcp" rev="${commons-dbcp.version}">
        <exclude module="commons-pool"/>
        <exclude org="org.apache.geronimo.specs" module="geronimo-jta_1.1_spec"/>
      </dependency>
      

      hive-metastore already depends on commons-dbcp, so I don't believe we need to declare it explicitly.
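
      One way to confirm that libfb303 and commons-dbcp still arrive transitively before deleting their explicit declarations is an Ivy resolve report. A minimal sketch, assuming the build file already declares the Ivy Ant tasks under the `ivy` namespace and that a `default` configuration exists; the target name and the `${build.dir}/ivy-report` output directory are hypothetical:

      ```xml
      <!-- Hypothetical target: resolve the default conf and write a dependency report -->
      <target name="ivy-report" description="Report resolved dependencies for the default conf">
        <ivy:resolve file="ivy.xml" conf="default"/>
        <ivy:report todir="${build.dir}/ivy-report" conf="default" graph="false"/>
      </target>
      ```

      Running the report before and after a removal and comparing the resolved modules would show whether anything dropped off the classpath.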

      <dependency org="com.google.guava" name="guava" rev="${guava.version}"/>
      

      hive-exec also depends on guava 11.0.2, so I don't believe we need to declare it ourselves.
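
      For reference, a sketch of what the hcatalog-core dependency list might look like if every proposal above were accepted. It assumes tests still pass after the removals and that a `test` configuration exists for Pig; neither has been verified here:

      ```xml
      <dependencies>
        <!-- Unchanged: the dependencies that already look good -->
        <dependency org="org.apache.hadoop" name="hadoop-tools" rev="${hadoop20.version}" conf="default->*"/>
        <dependency org="org.apache.hive" name="hive-builtins" rev="${hive.version}"/>
        <dependency org="org.apache.hive" name="hive-metastore" rev="${hive.version}"/>
        <dependency org="org.apache.hive" name="hive-common" rev="${hive.version}"/>
        <dependency org="org.apache.hive" name="hive-exec" rev="${hive.version}"/>
        <dependency org="org.apache.hive" name="hive-cli" rev="${hive.version}"/>
        <dependency org="org.apache.hive" name="hive-hbase-handler" rev="${hive.version}">
          <exclude org="org.apache.maven.plugins"/>
          <exclude org="org.jruby"/>
        </dependency>

        <!-- Pig moved to the (assumed) test configuration -->
        <dependency org="org.apache.pig" name="pig" rev="${pig.version}" conf="test->default"/>

        <!-- No longer declared: management-api, jackson-mapper-asl, jackson-core-asl,
             libfb303, commons-dbcp, guava (expected transitively via Hive, or unused) -->
      </dependencies>
      ```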

      Attachments

        1. HCATALOG-520_simfy_deps.1.patch
          86 kB
          Travis Crawford

            People

              Assignee: Travis Crawford (traviscrawford)
              Reporter: Travis Crawford (traviscrawford)
              Votes: 0
              Watchers: 3
