Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Reviewed
Description
TableInputFormatBase's getSplits() method instantiates a new RegionSizeCalculator on every call. Instantiating a RegionSizeCalculator involves scanning meta for all region locations of the given table. This can be costly for large tables, and we don't know how often a subclass will call getSplits().
When initializeTable is called, we already cache the RegionLocator and Admin that are passed into the RegionSizeCalculator. We should cache the RegionSizeCalculator itself at the same time to avoid unnecessary meta scans on repeated getSplits() calls.
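The proposed change can be illustrated with a minimal, self-contained sketch. It does not use the real HBase classes: SizeCalculatorSketch and CachingInputFormatSketch are hypothetical stand-ins for RegionSizeCalculator and TableInputFormatBase, and the metaScans counter exists only to make the cost visible. The assumption is that the expensive lookup happens in the calculator's constructor, as the description states.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for RegionSizeCalculator: construction is the expensive step.
class SizeCalculatorSketch {
    // Counts simulated meta scans; for illustration only.
    static int metaScans = 0;

    SizeCalculatorSketch(String tableName) {
        // The real class would look up every region location of the table here.
        metaScans++;
    }

    long regionSize(String regionName) {
        // Placeholder value; the sketch is about construction cost, not sizes.
        return 0L;
    }
}

// Hypothetical stand-in for TableInputFormatBase.
class CachingInputFormatSketch {
    private String tableName;
    private SizeCalculatorSketch sizeCalculator;

    // Build the calculator once, alongside the other per-table state.
    void initializeTable(String tableName) {
        this.tableName = tableName;
        this.sizeCalculator = new SizeCalculatorSketch(tableName);
    }

    // Reuse the cached calculator instead of constructing a new one per call.
    List<String> getSplits() {
        if (sizeCalculator == null) {
            throw new IllegalStateException("initializeTable must be called first");
        }
        List<String> splits = new ArrayList<>();
        String region = "region-0"; // single hypothetical region
        splits.add(tableName + ":" + region + " size=" + sizeCalculator.regionSize(region));
        return splits;
    }

    public static void main(String[] args) {
        CachingInputFormatSketch format = new CachingInputFormatSketch();
        format.initializeTable("example_table");
        format.getSplits();
        format.getSplits();
        System.out.println("meta scans: " + SizeCalculatorSketch.metaScans);
    }
}
```

In this sketch, repeated getSplits() calls trigger only the one simulated scan made during initializeTable. The trade-off is that a cached calculator reflects the table as it was at initialization time, so a later initializeTable call is what refreshes it.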