ashutoshc requested code review of "HIVE-538 [jira] make hive_jdbc.jar self-containing".
Reviewers: JIRA
https://issues.apache.org/jira/browse/HIVE-538
This patch introduces two new targets:
a) jar-jdbc-combined : This target generates a jar file containing all the Hive jars required by the JDBC driver.
b) jar-jdbc-rt-deps : This target generates a single jar file containing all of Hive's runtime dependencies.
Currently, most jars in hive/build/dist/lib and the hadoop-*-core.jar are required in the classpath to run JDBC applications on Hive. We need to do at least the following to get rid of most of the unnecessary dependencies:
1. get rid of dynamic serde and use a standard serialization format, maybe tab-separated, JSON or Avro
2. don't use Hadoop configuration parameters
3. repackage thrift and fb303 classes into hive_jdbc.jar
TEST PLAN
EMPTY
REVISION DETAIL
https://reviews.facebook.net/D2553
AFFECTED FILES
build.xml
ivy/libraries.properties
ivy.xml
Empirically, the following jars appear to be required at present to use the Hive JDBC driver (version 0.5.0):
I propose modifying the build process to combine the classes from the first set of jars into a single jar. That way users only need to add the hadoop-core jar and the hive-jdbc-combined jar to their classpath. As other dependencies are removed or refactored away, we could thin out what goes into the combined jar.
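As a rough way to sanity-check a trimmed classpath, a small probe like the one below could be run with only the two jars on the classpath. This is a minimal sketch, not part of the patch: the class DriverClasspathCheck is hypothetical, and the driver class name org.apache.hadoop.hive.jdbc.HiveDriver is assumed to be the one shipped with this Hive version.

```java
public class DriverClasspathCheck {
    // Assumed driver class name for this Hive release; adjust if your version differs.
    static final String HIVE_DRIVER = "org.apache.hadoop.hive.jdbc.HiveDriver";

    // Returns true if the named class can be loaded and initialized from the
    // current classpath, false if it (or something its initializer needs) is missing.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError and ExceptionInInitializerError.
            return false;
        }
    }

    public static void main(String[] args) {
        // An optional first argument overrides the class name to probe.
        String name = args.length > 0 ? args[0] : HIVE_DRIVER;
        String verdict = isLoadable(name) ? "is loadable" : "is NOT loadable";
        System.out.println(name + " " + verdict);
    }
}
```

It could be invoked with something like java -cp hive-jdbc-combined.jar:hadoop-core.jar:. DriverClasspathCheck, where both jar file names are placeholders. Note that this only confirms the driver class itself loads; dependencies that are first touched when a connection is opened would still surface later.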
I can take on this JIRA if others agree with the approach.