Hadoop HDFS / HDFS-3366

Some stacktraces are now too lengthy and sometimes no good

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Invalid
    • Affects Version/s: 2.0.0-alpha

    Description

      This is a high-on-nitpick ticket for the benefit of troubleshooting.

      This is partly related to all the protobuf (PB) changes we've had, and partly to Java and the JVM.

      Take the case of an AccessControlException, which is pretty common in the HDFS permissions layer. Due to several more calls added at the RPC layer for PB (or maybe something else, if I am mistaken), we now get:

      Caused by: org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=yarn, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
      	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:205)
      	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:186)
      	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:135)
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4204)
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4175)
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:2565)
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:2529)
      	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:640)
      	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:412)
      	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42618)
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:448)
      	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:891)
      	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1661)
      	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1657)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:415)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1204)
      	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1655)
      
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:205)
      	at $Proxy10.mkdirs(Unknown Source)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:601)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:165)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:84)
      	at $Proxy10.mkdirs(Unknown Source)
      	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:430)
      	at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:1717)
      	... 9 more
      

      The "9 more" part is what I was looking for, to identify the caller to debug or to find the exact directory. However, it now gets eaten away because the mkdirs-to-exception trace itself has grown quite a bit. In 0.20 there were far fewer calls, which at least let us see the real caller of mkdirs.

      I'm actually not sure what causes Java to print "... X more" in this form of exception output, but if that's controllable, I'm all in favor of increasing the number of frames shown for HDFS (using new default Java opts?), so that when an exception does occur, we don't get a nearly unusable stack trace.
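
For reference, the "... X more" line is produced by `Throwable.printStackTrace` itself. Per its Javadoc, it means the last X frames of a "Caused by" trace are identical to the bottom X frames of the enclosing exception's trace, so they are not repeated. As far as I know, no JVM option changes this. The sketch below is illustrative only: the class and method names (`FullCauseTrace`, `lowLevel`, `caller`, `fullTrace`) are hypothetical and not part of Hadoop. It contrasts the default output with a helper that walks the cause chain and prints every frame via `getStackTrace()`.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Illustrative only: none of these names exist in Hadoop.
class FullCauseTrace {

    // Hypothetical stand-in for a low-level failure such as a permission check.
    static void lowLevel() {
        throw new IllegalStateException("Permission denied");
    }

    // Hypothetical caller that wraps the failure, as a client-side layer might.
    static void caller() {
        try {
            lowLevel();
        } catch (IllegalStateException e) {
            throw new RuntimeException("mkdirs failed", e);
        }
    }

    // Default rendering: frames a cause shares with its enclosing exception
    // are collapsed into a "... n more" line.
    static String defaultTrace(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw, true));
        return sw.toString();
    }

    // Renders every frame of every throwable in the cause chain, with no elision.
    // Suppressed exceptions are ignored to keep the sketch short.
    static String fullTrace(Throwable t) {
        StringBuilder sb = new StringBuilder();
        String prefix = "";
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            sb.append(prefix).append(cur).append('\n');
            for (StackTraceElement frame : cur.getStackTrace()) {
                sb.append("\tat ").append(frame).append('\n');
            }
            prefix = "Caused by: ";
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        try {
            caller();
        } catch (RuntimeException e) {
            System.out.println(defaultTrace(e)); // cause section ends with "... n more"
            System.out.println(fullTrace(e));    // same chain, all frames listed
        }
    }
}
```

Note that the elided frames are never lost: they are exactly the bottom frames already printed for the enclosing exception, so the real caller can be read from the outermost trace in the log.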

          People

            Assignee: Unassigned
            Reporter: Harsh J (qwertymaniac)