Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 0.14.0, 2.3.2
Description
Currently the hcat command adds HBase jars to the classpath by using find to walk the directories under $HBASE_HOME/lib.
# Look for HBase in a BigTop-compatible way. Avoid thrift version
# conflict with modern versions of HBase.
HBASE_HOME=${HBASE_HOME:-"/usr/lib/hbase"}
HBASE_CONF_DIR=${HBASE_CONF_DIR:-"${HBASE_HOME}/conf"}
if [ -d ${HBASE_HOME} ] ; then
  for jar in $(find $HBASE_HOME -name '*.jar' -not -name '*thrift*'); do
    HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar}
  done
  export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:${HBASE_CLASSPATH}"
fi

if [ -d $HBASE_CONF_DIR ] ; then
  HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:${HBASE_CONF_DIR}"
fi
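For illustration, the breadth of that sweep can be checked by running the same find by hand. This is a sketch only, assuming the BigTop default location from the script above; the actual jar listing varies by HBase release:

# Illustrative: print everything the find-based loop would put on the
# classpath. On an HBase 2.x install this mixes client, server, shell,
# and shaded artifacts indiscriminately.
find "${HBASE_HOME:-/usr/lib/hbase}" -name '*.jar' -not -name '*thrift*' | sort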
This is incorrect, as that path contains jars for a mixture of purposes: HBase client jars, HBase server jars, and HBase shell-specific jars. The inclusion of unneeded jars is mostly innocuous until the upcoming HBase 2.1.0 release. That release will include HBASE-20615 and HBASE-19735, which means most client-facing installations will have a number of shaded client artifacts present.
With those changes in place, the current implementation will include in the hcat runtime a mix of shaded and non-shaded HBase artifacts, some of which contain Hadoop classes rewritten to use a shaded version of protobuf. When these mix with other Hadoop classes on the classpath that have not been rewritten, hcat fails with errors that look like this:
Exception in thread "main" java.lang.ClassCastException: org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetFileInfoRequestProto cannot be cast to org.apache.hadoop.hbase.shaded.com.google.protobuf.Message
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:225)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
    at com.sun.proxy.$Proxy28.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:875)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
    at com.sun.proxy.$Proxy29.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1643)
    at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1495)
    at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1492)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1507)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1668)
    at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:686)
    at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:625)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:557)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:524)
    at org.apache.hive.hcatalog.cli.HCatCli.main(HCatCli.java:149)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:313)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:227)
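One way to confirm the mix on a given install (a diagnostic sketch only, assuming a POSIX shell with unzip on the PATH; this is not part of the proposed fix) is to ask which of the swept jars bundle their own copy of the Hadoop IPC class named at the top of the trace:

# Diagnostic sketch: list jars under HBASE_HOME that carry a copy of
# org.apache.hadoop.ipc.ProtobufRpcEngine. A client classpath should get
# this class only from Hadoop; any HBase shaded artifact that bundles a
# rewritten copy showing up here is the mix that triggers the cast failure.
for jar in $(find "${HBASE_HOME:-/usr/lib/hbase}" -name '*.jar'); do
  if unzip -l "$jar" 2>/dev/null | grep -q 'org/apache/hadoop/ipc/ProtobufRpcEngine'; then
    echo "$jar"
  fi
done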
To fix this issue in a way that transparently handles HBase versions, the hcat command should take the same approach as the Hive CLI and ask HBase for the appropriate set of jars:
# HBase detection. Need bin/hbase and a conf dir for building classpath entries.
# Start with BigTop defaults for HBASE_HOME and HBASE_CONF_DIR.
HBASE_HOME=${HBASE_HOME:-"/usr/lib/hbase"}
HBASE_CONF_DIR=${HBASE_CONF_DIR:-"/etc/hbase/conf"}
if [[ ! -d $HBASE_CONF_DIR ]] ; then
  # not explicitly set, nor in BigTop location. Try looking in HBASE_HOME.
  HBASE_CONF_DIR="$HBASE_HOME/conf"
fi

# perhaps we've located the HBase config. if so, include it on classpath.
if [[ -d $HBASE_CONF_DIR ]] ; then
  export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:${HBASE_CONF_DIR}"
fi

# look for the hbase script. First check HBASE_HOME and then ask PATH.
if [[ -e $HBASE_HOME/bin/hbase ]] ; then
  HBASE_BIN="$HBASE_HOME/bin/hbase"
fi
HBASE_BIN=${HBASE_BIN:-"$(which hbase)"}

# perhaps we've located HBase. If so, include its details on the classpath
if [[ -n $HBASE_BIN ]] ; then
  # exclude ZK, PB, and Guava (See HIVE-2055)
  # depends on HBASE-8438 (hbase-0.94.14+, hbase-0.96.1+) for `hbase mapredcp` command
  for x in $($HBASE_BIN mapredcp 2>/dev/null | tr ':' '\n') ; do
    if [[ $x == *zookeeper* || $x == *protobuf-java* || $x == *guava* ]] ; then
      continue
    fi
    # TODO: should these be added to AUX_PARAM as well?
    export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:${x}"
  done
fi
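Since hbase mapredcp just prints a colon-separated classpath, the effect of that loop can be previewed by hand; this one-liner mirrors the same exclusions (output naturally depends on the local HBase install):

# Preview: what HBase reports as its minimal MapReduce client classpath,
# one entry per line, with the same ZK/protobuf/Guava exclusions applied.
hbase mapredcp 2>/dev/null | tr ':' '\n' | grep -Ev 'zookeeper|protobuf-java|guava'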
Issue Links
- is broken by
  - HIVE-6695 bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621] (Closed)
- is related to
  - HBASE-20615 emphasize use of shaded client jars when they're present in an install (Resolved)
  - HBASE-19735 Create a minimal "client" tarball installation (Resolved)