Description
Improvements such as HDFS-14171 and HDFS-14632 addressed the performance problem caused by calling getNumLiveDataNodes once per block. However, that fix only distinguishes whether the dfs.namenode.safemode.min.datanodes parameter is set to 0 or not:
 private boolean areThresholdsMet() {
   assert namesystem.hasWriteLock();
-  int datanodeNum = blockManager.getDatanodeManager().getNumLiveDataNodes();
+  // Calculating the number of live datanodes is time-consuming
+  // in large clusters. Skip it when datanodeThreshold is zero.
+  int datanodeNum = 0;
+  if (datanodeThreshold > 0) {
+    datanodeNum = blockManager.getDatanodeManager().getNumLiveDataNodes();
+  }
   synchronized (this) {
     return blockSafe >= blockThreshold && datanodeNum >= datanodeThreshold;
   }
I think the logic above still causes unnecessary evaluations of getNumLiveDataNodes when dfs.namenode.safemode.min.datanodes is set to a value greater than 0, even though "blockSafe >= blockThreshold" is false for most of the time the NameNode (NN) spends in startup safe mode. We could do something like the following to avoid this:
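private boolean areThresholdsMet() {
  assert namesystem.hasWriteLock();
  synchronized (this) {
    // The ternary is parenthesized because && binds tighter than ?: in Java.
    // The live datanode count is only computed once the block threshold is met.
    return blockSafe >= blockThreshold
        && ((datanodeThreshold > 0)
            ? blockManager.getDatanodeManager().getNumLiveDataNodes() >= datanodeThreshold
            : true);
  }
}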
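To illustrate why the order of evaluation matters here, the following is a minimal, self-contained sketch and not the actual HDFS code. The class SafeModeThresholdSketch, its fields, and the liveNodes supplier are hypothetical stand-ins for the safe-mode state and for getNumLiveDataNodes; a counter records how often the expensive call runs. It also shows that a ternary placed after && without its own parentheses is parsed as (a && b) ? x : true, which returns true when the block threshold has not been met.

```java
import java.util.function.IntSupplier;

/** Hypothetical stand-in for the safe-mode threshold check; not the HDFS class. */
class SafeModeThresholdSketch {
    final long blockSafe;
    final long blockThreshold;
    final int datanodeThreshold;
    int liveNodeCalls = 0;        // how many times the expensive count ran
    final IntSupplier liveNodes;  // stands in for getNumLiveDataNodes()

    SafeModeThresholdSketch(long blockSafe, long blockThreshold,
                            int datanodeThreshold, int liveDatanodes) {
        this.blockSafe = blockSafe;
        this.blockThreshold = blockThreshold;
        this.datanodeThreshold = datanodeThreshold;
        this.liveNodes = () -> { liveNodeCalls++; return liveDatanodes; };
    }

    /** Counts live datanodes up front whenever the threshold is positive. */
    synchronized boolean eager() {
        int datanodeNum = 0;
        if (datanodeThreshold > 0) {
            datanodeNum = liveNodes.getAsInt();
        }
        return blockSafe >= blockThreshold && datanodeNum >= datanodeThreshold;
    }

    /** Counts live datanodes only after the block threshold is met. */
    synchronized boolean lazy() {
        return blockSafe >= blockThreshold
            && (datanodeThreshold <= 0
                || liveNodes.getAsInt() >= datanodeThreshold);
    }

    /** Pitfall: parsed as (blocksOk && threshold > 0) ? ... : true. */
    synchronized boolean unparenthesizedTernary() {
        return blockSafe >= blockThreshold && (datanodeThreshold > 0)
            ? liveNodes.getAsInt() >= datanodeThreshold
            : true;
    }

    public static void main(String[] args) {
        // Block threshold not yet met: 5 safe blocks out of 10 required.
        SafeModeThresholdSketch s = new SafeModeThresholdSketch(5, 10, 3, 4);
        System.out.println("lazy=" + s.lazy() + " calls=" + s.liveNodeCalls);
        System.out.println("eager=" + s.eager() + " calls=" + s.liveNodeCalls);
        System.out.println("unparenthesized=" + s.unparenthesizedTernary());
    }
}
```

In this sketch the lazy form relies on the short-circuit behavior of && and ||, so the supplier is never invoked while the block threshold is unmet, which is the common case during startup safe mode as described above.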
Attachments
Issue Links
- relates to: HDFS-15594 Lazy calculate live datanodes in safe mode tip (Resolved)
- supersedes: HDFS-14171 Performance improvement in Tailing EditLog (Resolved)