Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 3.1.2
Description
We use the following SQL to create a Hive external table that reads from HBase:
CREATE EXTERNAL TABLE if not exists dev.sanyu_spotlight_headline_material(
  rowkey string COMMENT 'HBase主键',
  content string COMMENT '图文正文'
)
USING HIVE
ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  'hbase.columns.mapping'=':key, cf1:content'
)
TBLPROPERTIES (
  'hbase.table.name'='spotlight_headline_material'
);
However, this statement fails in Spark 3.1.2 with the following exception:
21/09/27 11:44:24 INFO scheduler.DAGScheduler: Asked to cancel job group 26d7459f-7b58-4c18-9939-5f2737525ff2
21/09/27 11:44:24 ERROR thriftserver.SparkExecuteStatementOperation: Error executing query with 26d7459f-7b58-4c18-9939-5f2737525ff2, currentState RUNNING,
org.apache.spark.sql.catalyst.parser.ParseException:
Operation not allowed: Unexpected combination of ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' and STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'WITHSERDEPROPERTIES('hbase.columns.mapping'=':key, cf1:content')(line 5, pos 0)
The check that rejects this combination was introduced by this change: https://github.com/apache/spark/pull/28026
Could anyone explain how to create an external table backed by HBase in Spark 3 now?