Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 2.2.0
- Component/s: None
- Hadoop Flags: Reviewed
Description
We've received reports that the NameNode can hit a NullPointerException when the topology script is missing. This issue tracks whether we can improve the validation logic and give a more informative error message.
Here is a sample stack trace:
Getting NPE from HDFS:
2015-02-06 23:02:12,250 ERROR [pool-4-thread-1] util.HFileV1Detector: Got exception while reading trailer for file:hdfs://hqhd02nm01.pclc0.merkle.local:8020/hbase/.META./1028785192/info/1490a396aea448b693da563f76a28486
org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.sortLocatedBlocks(DatanodeManager.java:359)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1789)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:542)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:362)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy14.getBlockLocations(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:254)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy15.getBlockLocations(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1220)
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1210)
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1200)
at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:271)
at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:238)
at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:231)
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1498)
at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:302)
at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:298)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:298)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:766)
at org.apache.hadoop.hbase.util.HFileV1Detector$1.call(HFileV1Detector.java:320)
at org.apache.hadoop.hbase.util.HFileV1Detector$1.call(HFileV1Detector.java:300)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2015-02-06 23:02:12,263 ERROR [pool-4-thread-1] util.HFileV1Detector: Got exception while reading trailer for file:hdfs://hqhd02nm01.pclc0.merkle.local:8020/hbase/.META./1028785192/info/a06f2483f6864d818884d0a451cb91d5
org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.sortLocatedBlocks(DatanodeManager.java:359)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1789)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:542)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:362)
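For reference, below is a minimal sketch of the kind of defensive validation the description asks for, written against the public DNSToSwitchMapping interface. It is an illustration only, not the committed patch; the class name SafeTopologyResolver and the default-rack fallback behaviour are assumptions.

import java.util.ArrayList;
import java.util.List;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.net.DNSToSwitchMapping;
import org.apache.hadoop.net.NetworkTopology;

// Hypothetical sketch, not the committed fix: validate the topology
// resolution result so a missing or broken script yields a clear error and
// a default-rack fallback instead of a NullPointerException later in
// DatanodeManager#sortLocatedBlocks.
public class SafeTopologyResolver {
  private static final Log LOG = LogFactory.getLog(SafeTopologyResolver.class);

  private final DNSToSwitchMapping mapping;

  public SafeTopologyResolver(DNSToSwitchMapping mapping) {
    this.mapping = mapping;
  }

  /** Resolves rack locations, falling back to the default rack when the
   *  underlying mapping returns null or a list of the wrong size. */
  public List<String> resolve(List<String> hosts) {
    List<String> locations = mapping.resolve(hosts);
    if (locations == null || locations.size() != hosts.size()) {
      LOG.error("Topology resolution returned "
          + (locations == null ? "null" : "only " + locations.size() + " entries")
          + " for " + hosts.size() + " host(s); check that the topology script"
          + " exists and is executable. Falling back to "
          + NetworkTopology.DEFAULT_RACK);
      List<String> fallback = new ArrayList<String>(hosts.size());
      for (int i = 0; i < hosts.size(); i++) {
        fallback.add(NetworkTopology.DEFAULT_RACK);
      }
      return fallback;
    }
    return locations;
  }
}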
Attachments
Issue Links
- is duplicated by
  - HDFS-8396 NPE exception can be thrown in DatanodeManager#sortLocatedBlocks (Resolved)
Handles the null returned by the ShellExecutor correctly in the case of broken shell scripts.
Added test scripts with correct and incorrect handling of topology.
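To illustrate the pattern described in the resolution note, here is a minimal sketch of running a topology script through Hadoop's ShellCommandExecutor and treating a failure as a null result that callers must handle explicitly. The class TopologyScriptRunner and its method are hypothetical; this is not the committed patch or its tests.

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.util.Shell.ShellCommandExecutor;

// Hypothetical illustration, not the actual patch: run the topology script
// and return null when it is missing or fails, so the caller has to handle
// "no mapping" explicitly instead of dereferencing a null later.
public class TopologyScriptRunner {

  /** Runs the topology script for the given hosts; returns the raw script
   *  output, or null if the script could not be executed successfully. */
  public static String runScript(String scriptPath, List<String> hosts) {
    List<String> cmd = new ArrayList<String>();
    cmd.add(scriptPath);
    cmd.addAll(hosts);
    ShellCommandExecutor exec =
        new ShellCommandExecutor(cmd.toArray(new String[cmd.size()]));
    try {
      exec.execute();
    } catch (Exception e) {
      // A broken or missing script lands here; report it clearly rather
      // than passing a null deeper into the block-sorting code path.
      System.err.println("Topology script '" + scriptPath + "' failed: " + e);
      return null;
    }
    return exec.getOutput();
  }
}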