Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Impala 2.13.0
Description
HdfsTable.updateUnpartitionedTableFileMd() resets the existing default Partition object and creates a new, empty one. It then calls refreshPartitionFileMetadata() with this new partition, whose list of file descriptors is empty. That call lists the directory and, because no file can be found in the empty descriptor list, issues a separate RPC to HDFS for each file to get its locations.
This is quite wasteful compared with just using the API that returns the located statuses for the whole directory.
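To make the cost difference concrete, here is a minimal, self-contained sketch. It does not use the real Hadoop client: FakeNameNode and its methods are hypothetical stand-ins that only count round trips. In Hadoop terms, the per-file path roughly corresponds to FileSystem.listStatus() followed by getFileBlockLocations() for each file, and the batched path to FileSystem.listLocatedStatus(); that mapping is my assumption about which APIs are meant here. The real located listing may also fetch a large directory in several chunks rather than in a single call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ListingCostSketch {
    /** Hypothetical in-memory stand-in for the NameNode; it only counts round trips. */
    static class FakeNameNode {
        private final Map<String, List<String>> hostsByFile = new LinkedHashMap<>();
        int rpcCount = 0;

        void addFile(String name, List<String> hosts) {
            hostsByFile.put(name, hosts);
        }

        // One round trip: file names only, no locations.
        List<String> listNames() {
            rpcCount++;
            return new ArrayList<>(hostsByFile.keySet());
        }

        // One round trip per call: locations of a single file.
        List<String> getLocations(String name) {
            rpcCount++;
            return hostsByFile.get(name);
        }

        // One round trip: names together with their locations.
        Map<String, List<String>> listLocated() {
            rpcCount++;
            return new LinkedHashMap<>(hostsByFile);
        }
    }

    // Pattern described in this issue: list the directory, then ask per file.
    static Map<String, List<String>> loadPerFile(FakeNameNode nn) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String name : nn.listNames()) {
            result.put(name, nn.getLocations(name));
        }
        return result;
    }

    // Suggested pattern: one listing call that already carries locations.
    static Map<String, List<String>> loadBatched(FakeNameNode nn) {
        return nn.listLocated();
    }

    public static void main(String[] args) {
        FakeNameNode nn = new FakeNameNode();
        for (int i = 0; i < 1000; i++) {
            nn.addFile("part-" + i, Arrays.asList("host-" + (i % 3)));
        }
        loadPerFile(nn);
        System.out.println("per-file round trips: " + nn.rpcCount);
        nn.rpcCount = 0;
        loadBatched(nn);
        System.out.println("batched round trips: " + nn.rpcCount);
    }
}
```

In this toy model the per-file path costs one round trip for the listing plus one per file, while the batched path costs a single round trip regardless of the number of files.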
Alternatively, it seems like the new Partition object should probably keep the old file descriptor list so that the incremental refresh path can work.
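A rough sketch of why keeping the old descriptors matters, under stated assumptions: Status, Fd, LocationFetcher and refresh() are hypothetical simplifications, not Impala's classes, and the rule "reuse the old descriptor when name, modification time and length are unchanged" is my assumption about what an incremental refresh would compare.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IncrementalRefreshSketch {
    /** Hypothetical listing entry: what a plain directory listing returns (no locations). */
    static final class Status {
        final String name;
        final long mtime;
        final long length;

        Status(String name, long mtime, long length) {
            this.name = name;
            this.mtime = mtime;
            this.length = length;
        }
    }

    /** Hypothetical, simplified file descriptor that also carries block hosts. */
    static final class Fd {
        final Status status;
        final List<String> hosts;

        Fd(Status status, List<String> hosts) {
            this.status = status;
            this.hosts = hosts;
        }
    }

    /** Stand-in for the per-file location RPC. */
    interface LocationFetcher {
        List<String> fetch(String fileName);
    }

    // Reuse an old descriptor when the file looks unchanged; fetch locations otherwise.
    static List<Fd> refresh(List<Fd> oldFds, List<Status> listing, LocationFetcher fetcher) {
        Map<String, Fd> oldByName = new HashMap<>();
        for (Fd fd : oldFds) {
            oldByName.put(fd.status.name, fd);
        }
        List<Fd> refreshed = new ArrayList<>();
        for (Status s : listing) {
            Fd old = oldByName.get(s.name);
            boolean unchanged = old != null
                && old.status.mtime == s.mtime
                && old.status.length == s.length;
            refreshed.add(unchanged ? old : new Fd(s, fetcher.fetch(s.name)));
        }
        return refreshed;
    }

    public static void main(String[] args) {
        int[] rpcCount = {0};
        LocationFetcher fetcher = name -> {
            rpcCount[0]++;
            return Arrays.asList("host-1");
        };
        List<Status> listing = Arrays.asList(
            new Status("a", 1L, 10L), new Status("b", 2L, 20L));

        // Empty old list, as when the partition is recreated from scratch.
        List<Fd> loaded = refresh(new ArrayList<>(), listing, fetcher);
        System.out.println("empty old list: " + rpcCount[0] + " location RPCs");

        // Old list carried over: unchanged files need no location lookup.
        rpcCount[0] = 0;
        refresh(loaded, listing, fetcher);
        System.out.println("old list kept: " + rpcCount[0] + " location RPCs");
    }
}
```

With an empty old list every file misses the lookup and triggers a fetch, which is the behavior described above. Passing the previous descriptors through lets unchanged files skip the fetch entirely.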