Standby/Observer NameNode should not process redundant replicas when setReplication decreases the replication factor.
Currently, when setReplication is called to decrease a file's replication factor:
- The ActiveNameNode calls BlockManager#processExtraRedundancyBlock to select the DN that holds the redundant replica, then adds that replica to excessRedundancyMap and to invalidateBlocks (RedundancyMonitor will be scheduled to delete the block on that DN).
- The StandbyNameNode or ObserverNameNode then loads the edit log and applies the SetReplicationOp. If the DN whose replica is to be deleted has not yet sent its incremental block report, BlockManager#processExtraRedundancyBlock is called here as well to select a redundant DN and add it to excessRedundancyMap. The DN selected here may differ from the one selected on the ActiveNameNode.
A DN left in excessRedundancyMap can affect DN decommissioning, so that the decommission operation cannot complete on the Standby/ObserverNameNode.
A concrete case:
Suppose a file has 3 replicas (d1, d2, d3) and setReplication is called to set it to 2 replicas.
- The ActiveNameNode selects d1 as the redundant replica and adds it to excessRedundancyMap and invalidateBlocks.
- The StandbyNameNode replays the SetReplicationOp (at this point d1 has not yet sent its incremental block report), so the redundant DN it selects may differ from the one chosen by the ActiveNameNode; for example, it selects d2 and adds it to excessRedundancyMap.
- d1 then finishes deleting the block and sends its incremental block report.
- The DN list for this block on the ActiveNameNode is d2 and d3 (d1 is removed from excessRedundancyMap when the incremental block report is processed).
- The DN list for this block on the StandbyNameNode is d2 and d3 (d2 cannot be removed from excessRedundancyMap when the incremental block report is processed, because the report comes from d1).
Now execute a decommission operation on d3.
- The ActiveNameNode selects a new node d4 to receive a copy of the replica, and d4 sends an incremental block report.
- The DN list for this block on the ActiveNameNode is d2, d3 (decommissioning), and d4, so d3 can be decommissioned normally.
- The DN list for this block on the StandbyNameNode is d3 (decommissioning), d2 (redundant), and d4. Since the requirement of two live replicas is not met, d3 cannot be decommissioned at this point.
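The timeline above can be replayed with a small toy model. This is only a sketch to make the replica counting explicit; it is not HDFS code. The class ExcessReplicaScenario and its methods are hypothetical, and it assumes that a replica counts as live only if its DN is neither marked excess nor decommissioning.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Toy model of one NameNode's view of a single block. Hypothetical, not HDFS code. */
class ExcessReplicaScenario {
    // DNs that this NameNode believes hold a replica of the block.
    final Set<String> locations = new LinkedHashSet<>();
    // Stand-in for this block's entries in excessRedundancyMap.
    final Set<String> excess = new LinkedHashSet<>();
    // DNs currently being decommissioned.
    final Set<String> decommissioning = new LinkedHashSet<>();
    int replication = 3;

    /** Replays the timeline; excessChoice is the DN this NameNode marks as excess. */
    static ExcessReplicaScenario replay(String excessChoice) {
        ExcessReplicaScenario nn = new ExcessReplicaScenario();
        nn.locations.add("d1");
        nn.locations.add("d2");
        nn.locations.add("d3");

        // setReplication from 3 to 2: one DN is selected as excess.
        nn.replication = 2;
        nn.excess.add(excessChoice);

        // d1 deletes the block and reports it; only d1's entries are cleared.
        nn.locations.remove("d1");
        nn.excess.remove("d1");

        // d3 starts decommissioning and d4 reports a new copy.
        nn.decommissioning.add("d3");
        nn.locations.add("d4");
        return nn;
    }

    /** Counts replicas on DNs that are neither excess nor decommissioning. */
    int liveReplicas() {
        int live = 0;
        for (String dn : locations) {
            if (!excess.contains(dn) && !decommissioning.contains(dn)) {
                live++;
            }
        }
        return live;
    }

    /** Assumed rule: decommission can finish once live replicas reach the target. */
    boolean decommissionCanFinish() {
        return liveReplicas() >= replication;
    }

    public static void main(String[] args) {
        System.out.println("active:  " + replay("d1").decommissionCanFinish());
        System.out.println("standby: " + replay("d2").decommissionCanFinish());
    }
}
```

Under this model the view that picked d1 ends with two live replicas (d2, d4) and lets d3 finish, while the view that picked d2 ends with one live replica (d4) and keeps d3 stuck in decommissioning.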
Therefore, the StandbyNameNode or ObserverNameNode should not process redundant replicas when setReplication is called.
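One way to picture the proposed behavior is a guard on the NameNode's HA state before any excess replica is selected. The sketch below is a self-contained illustration only: SetReplicationGuard, HaState, and the "pick the first surplus locations" rule are all hypothetical stand-ins, and none of this reflects the actual BlockManager code or the actual patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of skipping excess-replica selection off the active node. */
class SetReplicationGuard {
    enum HaState { ACTIVE, STANDBY, OBSERVER }

    private final HaState state;
    // Stand-in for the DNs recorded in excessRedundancyMap for one block.
    final List<String> excess = new ArrayList<>();

    SetReplicationGuard(HaState state) {
        this.state = state;
    }

    /** Returns true only if excess-replica selection actually ran. */
    boolean setReplication(int oldRepl, int newRepl, List<String> locations) {
        if (newRepl >= oldRepl) {
            return false; // not a decrease, nothing to select
        }
        if (state != HaState.ACTIVE) {
            return false; // standby/observer: wait for block reports instead
        }
        // Placeholder selection rule (an assumption): mark the first surplus DNs.
        int surplus = locations.size() - newRepl;
        for (int i = 0; i < surplus; i++) {
            excess.add(locations.get(i));
        }
        return true;
    }
}
```

With this guard, a standby or observer instance never records its own excess choice, so its view can only change through the block reports that follow the active node's decision.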