Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
3.1.0
-
None
-
None
Description
query such as "select count(1) from tpcds_10_parquet.store_returns where sr_returned_date_sk=2452802" return row count as 0 even though partition has enough rows in it.
upon investigation, I found that partition list passed during the StatsUtils.getNumRows is of zero size.
https://github.com/apache/hive/blob/ccaf783a198e142b408cb57415c4262d27b45831/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java#L438-L439
it seems partitionlist is retrieved during PartitionPruner
https://github.com/apache/hive/blob/36bf7f00731e3b95af3e5eeaa4ce39b375974a74/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java#L439
Hive serialized this filter expression using Kryo before passing to HMS
on the server-side if the hive.metastore.expression.proxy set to MsckPartitionExpressionProxy it tries to convert this expression into the string
because of this bad filter expression hive did not retrieve any partitions, I think to make it work hive should try to deserialize it similar to PartitionExpressionForMetastore.