IMPALA-7121

Clean up partitionIds_ member from HdfsTable


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.12.0
    • Fix Version/s: Impala 2.13.0, Impala 3.1.0
    • Component/s: Catalog
    • Labels: None

    Description

      HdfsTable already has a number of internal structures that are meant to speed up operations such as partition pruning. partitionIds_ is a HashSet of partition IDs, but the same information is already available in partitionMap_, which maps partition IDs to HdfsPartitions. As a result, we can simply drop partitionIds_ and modify getPartitionIds() to return partitionMap_.keySet().

      This is not expected to introduce regression for the following reasons:

      • HashMap.keySet() runs in O(1) time because it returns a view backed by the HashMap's keys rather than a copy of them.
      • We have to be careful not to modify the set returned from getPartitionIds(), because removing an element from it would also remove the corresponding entry from partitionMap_. This is safe today, as all call sites of getPartitionIds() immediately copy the items of the set into a separate set.
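
The two points above can be illustrated with a minimal, self-contained sketch. It is not the actual HdfsTable code: the class PartitionIdsSketch, the methods addPartition() and numPartitions(), and the use of String in place of HdfsPartition are assumptions made for illustration. Only the names partitionMap_ and getPartitionIds() come from this issue.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for HdfsTable; not the real Impala class.
class PartitionIdsSketch {
  // Maps partition ID to partition. A String stands in for HdfsPartition.
  private final Map<Long, String> partitionMap_ = new HashMap<>();

  // Hypothetical helper for populating the map.
  public void addPartition(long id, String partition) {
    partitionMap_.put(id, partition);
  }

  // Hypothetical helper for observing the size of the map.
  public int numPartitions() {
    return partitionMap_.size();
  }

  // Returns a live view of the map's keys; no copy is made here.
  public Set<Long> getPartitionIds() {
    return partitionMap_.keySet();
  }

  public static void main(String[] args) {
    PartitionIdsSketch table = new PartitionIdsSketch();
    table.addPartition(1L, "p1");
    table.addPartition(2L, "p2");

    // Safe pattern: copy the IDs first, then mutate only the copy.
    Set<Long> copy = new HashSet<>(table.getPartitionIds());
    copy.remove(1L);
    System.out.println(table.numPartitions()); // prints 2, map untouched

    // Unsafe pattern: removing from the view also removes the map entry.
    table.getPartitionIds().remove(2L);
    System.out.println(table.numPartitions()); // prints 1
  }
}
```

If a future call site could not be trusted to copy first, wrapping the returned set in Collections.unmodifiableSet() would be one way to enforce the contract, though the issue does not say that this was done.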

          People

            gaborkaszab Gabor Kaszab