Description
HDFS-14053 introduced a method "processMisReplicatedBlocks" to BlockManager, which fsck uses to schedule mis-replicated blocks for replication.
The method should return the number of blocks it processed, but it always returns zero because "processed" is never incremented in the method.
It should also drop and re-take the write lock every "numBlocksPerIteration" blocks. Because "processed" is never incremented, the inner loop condition "processed < limit" is always true, so the method processes the entire list under a single write lock acquisition and can hold the write lock for a long time.
public int processMisReplicatedBlocks(List<BlockInfo> blocks) {
  int processed = 0;
  Iterator<BlockInfo> iter = blocks.iterator();

  try {
    while (isPopulatingReplQueues() && namesystem.isRunning()
        && !Thread.currentThread().isInterrupted()
        && iter.hasNext()) {
      int limit = processed + numBlocksPerIteration;
      namesystem.writeLockInterruptibly();
      try {
        while (iter.hasNext() && processed < limit) {
          BlockInfo blk = iter.next();
          MisReplicationResult r = processMisReplicatedBlock(blk);
          LOG.debug("BLOCK* processMisReplicatedBlocks: " +
              "Re-scanned block {}, result is {}", blk, r);
        }
      } finally {
        namesystem.writeUnlock();
      }
    }
  } catch (InterruptedException ex) {
    LOG.info("Caught InterruptedException while scheduling replication work" +
        " for mis-replicated blocks");
    Thread.currentThread().interrupt();
  }

  return processed;
}
Due to this, fsck causes a warning to be logged in the NN for every mis-replicated file it schedules replication for, as it checks the processed count:
2019-07-10 15:46:14,790 WARN namenode.NameNode: Fsck: Block manager is able to process only 0 mis-replicated blocks (Total count : 1 ) for path /...
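One plausible fix is to increment "processed" inside the inner loop. The following is a minimal, self-contained sketch of that batching pattern, not the committed patch: the class BatchedLockSketch, its lockAcquisitions counter, and the use of a plain ReentrantReadWriteLock with String blocks are hypothetical stand-ins for BlockManager, the namesystem lock, and BlockInfo.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the BlockManager loop; names are illustrative only.
class BatchedLockSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final int numBlocksPerIteration;
  // Counts write lock acquisitions so the batching is observable.
  int lockAcquisitions = 0;

  BatchedLockSketch(int numBlocksPerIteration) {
    this.numBlocksPerIteration = numBlocksPerIteration;
  }

  int processAll(List<String> blocks) {
    int processed = 0;
    Iterator<String> iter = blocks.iterator();
    try {
      while (!Thread.currentThread().isInterrupted() && iter.hasNext()) {
        int limit = processed + numBlocksPerIteration;
        lock.writeLock().lockInterruptibly();
        lockAcquisitions++;
        try {
          while (iter.hasNext() && processed < limit) {
            iter.next(); // stand-in for processMisReplicatedBlock(blk)
            processed++; // the increment missing from the original method
          }
        } finally {
          // Released after each batch so other lock waiters can make progress.
          lock.writeLock().unlock();
        }
      }
    } catch (InterruptedException ex) {
      Thread.currentThread().interrupt();
    }
    return processed;
  }

  public static void main(String[] args) {
    BatchedLockSketch sketch = new BatchedLockSketch(2);
    int n = sketch.processAll(Arrays.asList("b1", "b2", "b3", "b4", "b5"));
    // Prints "5 blocks, 3 lock acquisitions"
    System.out.println(n + " blocks, " + sketch.lockAcquisitions + " lock acquisitions");
  }
}
```

With the increment in place, the returned count matches the number of blocks handled, so the fsck check on the processed count should no longer trigger the warning for blocks that were in fact processed.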
Attachments
Issue Links
- is related to HDFS-14053 Provide ability for NN to re-replicate based on topology changes. (Resolved)