Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
In table-level REFRESH, we check whether each partition has actually changed and skip updating unchanged partitions in the catalog:
https://github.com/apache/impala/blob/42fda24364786cc1a457890bd212bb3922479e95/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1098-L1101
    public void updatePartition(HdfsPartition.Builder partBuilder) throws CatalogException {
      HdfsPartition oldPartition = partBuilder.getOldInstance();
      ...
      boolean partitionNotChanged = partBuilder.equalsToOriginal(oldPartition);
      LOG.trace("Partition {} {}", oldPartition.getName(),
          partitionNotChanged ? "changed" : "unchanged");
      if (partitionNotChanged) return;
      HdfsPartition newPartition = partBuilder.build();
      // Partition is reloaded and hence cache directives are not dropped.
      dropPartition(oldPartition, false);
      addPartition(newPartition);
    }

However, in partition-level REFRESH, we always drop and re-add the partition:
https://github.com/apache/impala/blob/42fda24364786cc1a457890bd212bb3922479e95/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L3093-L3096
    for (Map.Entry<HdfsPartition.Builder, HdfsPartition> entry :
        partBuilderToPartitions.entrySet()) {
      if (entry.getValue() != null) {
        dropPartition(entry.getValue(), false);
      }
      addPartition(entry.getKey().build());
    }
We should add the same check to avoid updating unchanged partitions.
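
A rough, self-contained sketch of what that check could look like in the partition-level loop. This is not the actual patch: `PartitionRefreshSketch`, `Partition`, `Builder`, the `fileMetadata` field, and the `drops`/`adds` counters are hypothetical stand-ins for `HdfsTable`, `HdfsPartition`, and `HdfsPartition.Builder`, and the one-argument `dropPartition` omits the cache-directive flag. Only `equalsToOriginal`, `build`, `dropPartition`, `addPartition`, and `partBuilderToPartitions` are names taken from the snippets above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for HdfsTable; only the control flow is meant to carry over.
final class PartitionRefreshSketch {

  /** Immutable stand-in for HdfsPartition. */
  static final class Partition {
    final String name;
    final String fileMetadata;  // placeholder for file descriptors, stats, etc.

    Partition(String name, String fileMetadata) {
      this.name = name;
      this.fileMetadata = fileMetadata;
    }
  }

  /** Stand-in for HdfsPartition.Builder holding the reloaded state. */
  static final class Builder {
    final String name;
    final String fileMetadata;

    Builder(String name, String fileMetadata) {
      this.name = name;
      this.fileMetadata = fileMetadata;
    }

    // Assumed semantics: true when the reloaded state matches the old instance.
    boolean equalsToOriginal(Partition old) {
      return old != null
          && name.equals(old.name)
          && Objects.equals(fileMetadata, old.fileMetadata);
    }

    Partition build() {
      return new Partition(name, fileMetadata);
    }
  }

  final Map<String, Partition> partitions = new LinkedHashMap<>();
  int drops = 0;  // counters exist only to make the effect observable
  int adds = 0;

  void dropPartition(Partition p) {
    partitions.remove(p.name);
    drops++;
  }

  void addPartition(Partition p) {
    partitions.put(p.name, p);
    adds++;
  }

  /** Partition-level refresh loop with the proposed unchanged-partition check. */
  void refreshPartitions(Map<Builder, Partition> partBuilderToPartitions) {
    for (Map.Entry<Builder, Partition> entry : partBuilderToPartitions.entrySet()) {
      Builder builder = entry.getKey();
      Partition oldPartition = entry.getValue();  // null for a newly added partition
      // Proposed check: leave the catalog entry alone if nothing changed.
      if (oldPartition != null && builder.equalsToOriginal(oldPartition)) continue;
      if (oldPartition != null) dropPartition(oldPartition);
      addPartition(builder.build());
    }
  }
}
```

With this shape, refreshing an unchanged partition keeps the existing `Partition` instance in the map, while changed partitions are still dropped and re-added and new partitions (with a null old instance) are simply added.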