Description
Race condition is encountered while adding Block to postponedMisreplicatedBlocks which in turn tried to retrieve Block from BlockManager in which it may not be present.
This happens in HA, metasave in first NN succeeded but failed in second NN, StackTrace showing NPE is as follows:
2018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: IPC Server handler 24 on 8020, call Call#1 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from 172.26.9.163:602342018-07-12 21:39:09,783 WARN org.apache.hadoop.ipc.Server: IPC Server handler 24 on 8020, call Call#1 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from 172.26.9.163:60234java.lang.NullPointerException at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseSourceDatanodes(BlockManager.java:2175) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.dumpBlockMeta(BlockManager.java:830) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.metaSave(BlockManager.java:762) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1782) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1766) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.metaSave(NameNodeRpcServer.java:1320) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.metaSave(ClientNamenodeProtocolServerSideTranslatorPB.java:928) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)