We are trying to read data from Hive tables using Spark SQL but are unable to do so. These are the steps we followed:
- Created tables in Hive 3.1.0 and mapped them to tables in HBase 2.0.0 using HBaseStorageHandler (which uses HBaseSerDe).
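The DDL we used was of this general shape (table, column, and column-family names here are illustrative, not our actual schema):

```sql
-- Illustrative only: a Hive table backed by an HBase table via HBaseStorageHandler
CREATE TABLE db_name.table_name (
  rowkey STRING,
  col1   STRING
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf1:col1')
TBLPROPERTIES ('hbase.table.name' = 'table_name');
```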
- All Hive-related configuration, including spark.sql.warehouse.dir, the metastore Thrift server URI, ZooKeeper details, etc., is supplied via SparkConf.
- We then create the Spark session as SparkSession spark = SparkSession.builder().config(conf).enableHiveSupport().getOrCreate();
- Then, using the SQLContext, we try to read data from the Hive table with sqlContext.sql("select * from db_name.table_name").show();
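Put together, our read path looks roughly like this (the warehouse directory and metastore URI below are placeholders for our actual values):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.sql.SparkSession;

public class HiveHBaseRead {
    public static void main(String[] args) {
        // Placeholder values; in our job these come from our own configuration
        SparkConf conf = new SparkConf()
                .set("spark.sql.warehouse.dir", "/apps/hive/warehouse")
                .set("hive.metastore.uris", "thrift://metastore-host:9083");

        SparkSession spark = SparkSession.builder()
                .appName("HiveHBaseRead")
                .config(conf)
                .enableHiveSupport()
                .getOrCreate();

        // Query the HBase-backed Hive table; this is where the SerDe is resolved
        spark.sql("select * from db_name.table_name").show();
    }
}
```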
At this step we get the error: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.hbase.HBaseSerDe not found (full logs attached).
We include the hive-hbase-handler jar and all the other required jars in our commonLib and pass their absolute paths via the --jars option of spark-submit, yet the error persists.
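For reference, our spark-submit invocation looks roughly like this (the class name, jar names, and paths are placeholders for our actual setup):

```shell
# Illustrative only: how we pass the handler jars from commonLib
spark-submit \
  --class com.example.HiveHBaseRead \
  --master yarn \
  --jars /path/to/commonLib/hive-hbase-handler-3.1.0.jar,/path/to/commonLib/hbase-client-2.0.0.jar \
  app.jar
```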
We read in Spark's official documentation that it currently supports Hive only up to version 2.1. So is there another way to connect Spark 2.3 to Hive 3.0?