Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
First, some background: a non-native table can be created with partition columns defined, but such partition columns are problematic when the table is read through HCatInputFormat. Nothing disallows the table creation, and the documentation [1] does not mention that non-native tables cannot have partition columns. In fact, it suggests that "PARTITIONED BY" can be specified.
With such a table definition, any job using HCatInputFormat reads no data at all, and the cause is not obvious; it is only revealed by debugging. The bug stems from the getInputJobInfo method of the org.apache.hive.hcatalog.mapreduce.InitializeInput class, where it identifies the partitions to read. With partition columns defined, table.getPartitionKeys().size() is > 0, so the method proceeds to the listPartitionsByFilter(...) code. That call never finds any partitions, because partitions cannot be added to a non-native table (HIVE-1223). As a result, the returned InputJobInfo holds an empty List<PartInfo>. The method never takes the "Non partitioned table" path, where the table's StorageDescriptor and parameters are used to build a singleton PartInfo.
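The branching described above can be illustrated with a minimal, self-contained sketch. It does not use the actual Hive or HCatalog classes: `TableModel`, `selectPartInfos`, and the string placeholders standing in for `PartInfo` are hypothetical names introduced only for illustration, and the real getInputJobInfo logic may differ in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PartitionSelectionSketch {

    /** Hypothetical stand-in for a metastore table; only the fields relevant here. */
    static final class TableModel {
        final List<String> partitionKeys;        // columns declared via PARTITIONED BY
        final List<String> registeredPartitions; // partitions actually added to the table
        final String tableLevelStorage;          // stands in for the table's StorageDescriptor

        TableModel(List<String> partitionKeys,
                   List<String> registeredPartitions,
                   String tableLevelStorage) {
            this.partitionKeys = partitionKeys;
            this.registeredPartitions = registeredPartitions;
            this.tableLevelStorage = tableLevelStorage;
        }
    }

    /** Simplified model of the decision: which "PartInfo" entries would a job read? */
    static List<String> selectPartInfos(TableModel table) {
        List<String> partInfos = new ArrayList<>();
        if (!table.partitionKeys.isEmpty()) {
            // Partitioned path: only registered partitions are considered.
            // A non-native table cannot have partitions added (HIVE-1223),
            // so for such a table this list stays empty.
            partInfos.addAll(table.registeredPartitions);
        } else {
            // "Non partitioned table" path: a singleton built from the table itself.
            partInfos.add(table.tableLevelStorage);
        }
        return partInfos;
    }

    public static void main(String[] args) {
        TableModel withPartitionColumns = new TableModel(
                Arrays.asList("dt"), Collections.<String>emptyList(), "table-level-storage");
        TableModel withoutPartitionColumns = new TableModel(
                Collections.<String>emptyList(), Collections.<String>emptyList(), "table-level-storage");

        System.out.println(selectPartInfos(withPartitionColumns).size());    // prints 0
        System.out.println(selectPartInfos(withoutPartitionColumns).size()); // prints 1
    }
}
```

In this model, the table with partition columns yields nothing to read, while the otherwise identical table without them yields one entry, which matches the symptom reported here.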
This bug is quite similar to HIVE-18087 although it resides in a different layer of Hive.
We encountered this using the HBaseStorageHandler, although I don't believe that's a particularly relevant detail.
[1] https://cwiki.apache.org/confluence/display/Hive/StorageHandlers#StorageHandlers-DDL
Issue Links
- relates to
  - HIVE-18086 NullPointerException initializing query job when non-native table has partition columns (Open)
  - HIVE-18087 Simple select query finds nothing when non-native table has partition columns (Open)
  - HIVE-1223 support partitioning for non-native tables (Open)