One of our Hive tables is backed by HBase (HBaseStorageHandler). To simulate a Hive table partitioned by "DataDate", we use a composite rowkey in HBase, e.g. DataDate_Userid_Actionid_Timestamp. The example rowkey is as follows.
However, it seems Hive does not support "partial rowkey scan". For example, I want to get all data generated on 06/01/2014, so I issue the following Hive query, but it returns nothing:
select * from table where DataDate="20140601";
After several attempts, I found that I have to give the exact rowkey (e.g. 20140601_784353454_20123282_1401632522132) for Hive to find that record.
I would like to see the "partial rowkey scan" feature because, in HBase, a partial table scan (one limited to the rows that share a rowkey prefix) should perform better than a full table scan.
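To illustrate what I mean by a "partial rowkey scan", here is a minimal Python sketch. It is not HBase or Hive code; it only models a table as a sorted list of rowkeys and shows how a prefix such as "20140601_" can be turned into a start/stop key range, so that only the matching slice is read instead of every row. The function names and all sample rowkeys except the one quoted above are made up for illustration.

```python
from bisect import bisect_left


def make_rowkey(data_date, user_id, action_id, timestamp_ms):
    """Build a composite rowkey: DataDate_Userid_Actionid_Timestamp."""
    return f"{data_date}_{user_id}_{action_id}_{timestamp_ms}".encode()


def prefix_stop_key(prefix):
    """Return the smallest key that sorts after every key starting with prefix.

    Trailing 0xff bytes cannot be incremented, so they are dropped first.
    An empty result means there is no upper bound.
    """
    stripped = prefix.rstrip(b"\xff")
    if not stripped:
        return b""
    return stripped[:-1] + bytes([stripped[-1] + 1])


def prefix_scan(sorted_keys, prefix):
    """Return only the keys in [prefix, stop_key), found by binary search."""
    start = bisect_left(sorted_keys, prefix)
    stop_key = prefix_stop_key(prefix)
    end = bisect_left(sorted_keys, stop_key) if stop_key else len(sorted_keys)
    return sorted_keys[start:end]


# Hypothetical sample data; rowkeys are kept in sorted byte order, as in HBase.
ROWKEYS = sorted([
    make_rowkey("20140531", 111, 222, 1401500000000),
    make_rowkey("20140601", 784353454, 20123282, 1401632522132),
    make_rowkey("20140601", 900000001, 20123283, 1401640000000),
    make_rowkey("20140602", 784353454, 20123282, 1401700000000),
])

if __name__ == "__main__":
    # Only rows for 06/01/2014 are visited; other dates are skipped entirely.
    for key in prefix_scan(ROWKEYS, b"20140601_"):
        print(key.decode())
```

This is the behavior I would hope a DataDate predicate could be translated into: a bounded range over the rowkey rather than a filter applied to every row in the table.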
Is there any plan in the Hive community to support "partial rowkey scan" in the near future?