Details
-
Improvement
-
Status: Resolved
-
Minor
-
Resolution: Fixed
-
None
-
None
-
None
Description
// 1. Check that all locations are different. // 2. Count locations on different racks. Set<String> racks = new TreeSet<>(); for (DatanodeInfo dn : locs) racks.add(dn.getNetworkLocation()); ... racks.size()
Here, the code is counting the number of distinct Network Locations. However, it is using a TreeSet which has overhead to maintain item order and uses a linked structure internally. This overhead is unneeded since all that is required here is a count.
A NavigableSet implementation based on a TreeMap. The elements are ordered using their natural ordering, or by a Comparator provided at set creation time, depending on which constructor is used.
This implementation provides guaranteed log time cost for the basic operations (add, remove and contains).https://docs.oracle.com/javase/7/docs/api/java/util/TreeSet.html
Use Java streams for readability and because it uses a HashSet under the covers to perform the distinct action. HashSet uses an array internally and has constant time performance for the add method.
Attachments
Attachments
Issue Links
- is related to
-
HDFS-15783 Speed up BlockPlacementPolicyRackFaultTolerant#verifyBlockPlacement
- Resolved