Attached is a preliminary patch that addresses a bunch of TODOs. This doesn't address all of them, but since this is sort of a "grab bag" issue I wanted to put it up for early review now, and will add some more deltas on top in the next couple days.
Summary of changes:
.../src/main/java/org/apache/hadoop/ipc/RPC.java | 6 ++
.../java/org/apache/hadoop/hdfs/DFSClient.java | 15 +----
.../ClientDatanodeProtocolTranslatorPB.java | 9 ++-
Fixes a bunch of warnings I saw in the logs where RPC.stopProxy() was being called on translator objects. This would partially apply to trunk, but some of the HA changes we've done made it worse here, and would conflict anyway, so addressed in this patch.
.../hdfs/server/blockmanagement/BlockManager.java | 1 -
Remove unused import
.../hdfs/server/datanode/BPOfferService.java | 14 +---
- Remove a TODO about a potential synchronization problem - we we haven't seen it in practice, and it's only in shutdown, so I'm not too concerned.
- remove getNNSocketAddress since a BPOS now has multiple NNs associated.
- undeprecate getActiveNN, remove the TODO, since it's actually used by block synchronization (one of the few cases where we want to RPC directly to whichever NN is active and not the other). This is also used by the (now-abandoned) "distributed upgrade" code.
.../hdfs/server/datanode/BPServiceActor.java | 4 +-
Remove TODO about the way that addBlockPool is called. It strikes me as weird that we initialize the scanner after every heartbeat, but it's not related to HA.
.../hdfs/server/datanode/BlockPoolManager.java | 10 —
Remove getter which takes an NN address, since a block pool can correspond to multiple NN addresses.
.../hadoop/hdfs/server/datanode/DataNode.java | 71 ++++++++++++--------
- refactor cases where errorReport was directly called on an NN proxy to be passed through the DataNode->BPOS path more centrally. errorReports should go to both nodes.
- remove getNameNodeAddr(bpid) since again, a BP may have multiple NN addresses
- rename getBPNamenode to getActiveNamenodeForBP to clarify the result in the case of HA
- fix getNamenodeAddresses (part of JMX bean) to properly include all HA NNs
- rename isBPServiceAlive to isConnectedToNN to clarify its purpose - this is only used by MiniDFS currently, in order to determine when each DN has connected to the NNs in the cluster.
.../server/datanode/UpgradeManagerDatanode.java | 2 +-
.../server/datanode/UpgradeObjectDatanode.java | 11 +---
(updates based on above method renames)
.../hadoop/hdfs/server/namenode/FSDirectory.java | 15 ++---
- remove a TODO about a useless lock - was taking a writeLock despite asserting just above that the writeLock was already held
.../hdfs/server/namenode/FSEditLogLoader.java | 4 +-
- remove a TODO and replace it with an explanation about why the code is counter-intuitive there
.../hadoop/hdfs/server/namenode/FSImage.java | 4 -
- remove TODO about never saving a checkpoint on startup in HA. We addressed this issue with
- remove TODO about quota tracking being slow - filed HDFS-2989
.../hadoop/hdfs/server/namenode/FSNamesystem.java | 2 -
.../hadoop/hdfs/server/namenode/NameNode.java | 1 -
Removed three TODOs which were really more like idle design thoughts, don't seme to be relevant.
.../org/apache/hadoop/hdfs/MiniDFSCluster.java | 6 +-
.../hdfs/server/datanode/TestBlockRecovery.java | 6 +-
.../TestDataNodeMultipleRegistrations.java | 20 ++++--
.../hdfs/server/datanode/TestRefreshNamenodes.java | 28 +++++---
Just tracking renames above
.../server/namenode/ha/TestPipelinesFailover.java | 4 +-
.../server/namenode/ha/TestStandbyCheckpoints.java | 3 -
Stray TODOs which were already addressed previously