Details
Improvement
Status: Resolved
Major
Resolution: Fixed
Impala 2.12.0
None
Description
HdfsTable already has a number of internal structures that are meant to speed up processes like partition pruning. partitionIds_ is a HashSet of partition IDs, but the same information is already available in partitionMap_, which maps partition IDs to HdfsPartitions. As a result, we can simply drop partitionIds_ and modify getPartitionIds() to return partitionMap_.keySet().
This is not expected to introduce a regression, for the following reasons:
- HashMap.keySet() is O(1): it returns a view backed by the HashMap's own keys rather than copying them.
- We have to be careful not to modify the set returned from getPartitionIds(), because removing elements from it would also remove the corresponding entries from the partitionMap_ member. This is safe today because every call site of getPartitionIds() immediately copies the items of the set into a separate set.
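The view semantics described above can be illustrated with a minimal sketch. PartitionIdsSketch is a hypothetical stand-in for HdfsTable, not the actual Impala class; String replaces HdfsPartition, and addPartition() and numPartitions() are helpers assumed only for this example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for HdfsTable; String replaces HdfsPartition for brevity.
class PartitionIdsSketch {
  private final Map<Long, String> partitionMap_ = new HashMap<>();

  // Assumed helper, only used to populate the map in this sketch.
  void addPartition(long id, String name) { partitionMap_.put(id, name); }

  // Returns a live view of the map's keys; no copying takes place.
  Set<Long> getPartitionIds() { return partitionMap_.keySet(); }

  // Assumed helper, only used to observe the map's size in this sketch.
  int numPartitions() { return partitionMap_.size(); }

  public static void main(String[] args) {
    PartitionIdsSketch table = new PartitionIdsSketch();
    table.addPartition(1L, "p=1");
    table.addPartition(2L, "p=2");

    // Safe pattern (what the call sites do): copy first, then mutate the copy.
    Set<Long> copy = new HashSet<>(table.getPartitionIds());
    copy.remove(1L);
    System.out.println(table.numPartitions()); // prints 2, the map is untouched

    // Unsafe pattern: removing through the view also removes the map entry.
    table.getPartitionIds().remove(1L);
    System.out.println(table.numPartitions()); // prints 1
  }
}
```

If accidental mutation through the returned set ever became a concern, wrapping it with Collections.unmodifiableSet() would be one option; the ticket itself does not propose this.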