Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-3181

testHardLeaseRecoveryAfterNameNodeRestart fails when length before restart is 1 byte less than CRC chunk size

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: 2.0.0-alpha
    • Component/s: test
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart seems to be failing intermittently on jenkins.

      org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart
      Failing for the past 1 build (Since Failed#2163 )
      Took 8.4 sec.
      Error Message
      
      Lease mismatch on /hardLeaseRecovery owned by HDFS_NameNode but is accessed by DFSClient_NONMAPREDUCE_1147689755_1  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2076)  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2051)  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:1983)  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:492)  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:311)  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42604)  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:417)  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:891)  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1661)  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1657)  at java.security.AccessController.doPrivileged(Native Method)  at javax.security.auth.Subject.doAs(Subject.java:396)  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1205)  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1655) 
      
      Stacktrace
      
      org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: Lease mismatch on /hardLeaseRecovery owned by HDFS_NameNode but is accessed by DFSClient_NONMAPREDUCE_1147689755_1
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2076)
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2051)
      	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:1983)
      	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:492)
      	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:311)
      	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42604)
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:417)
              ...
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
      	at $Proxy15.getAdditionalDatanode(Unknown Source)
      	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:317)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:828)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:930)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:741)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:416)
      
      1. h3181_20120425.patch
        3 kB
        Tsz Wo Nicholas Sze
      2. h3181_20120426.patch
        5 kB
        Tsz Wo Nicholas Sze
      3. repro.txt
        0.7 kB
        Todd Lipcon
      4. TestLeaseRecovery2with1535.patch
        0.7 kB
        Tsz Wo Nicholas Sze
      5. testOut.txt
        120 kB
        Colin Patrick McCabe

        Issue Links

          Activity

          Tsz Wo Nicholas Sze made changes -
          Component/s test [ 12312916 ]
          Tsz Wo Nicholas Sze made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags Reviewed [ 10343 ]
          Fix Version/s 2.0.0 [ 12320353 ]
          Resolution Fixed [ 1 ]
          Tsz Wo Nicholas Sze made changes -
          Link This issue is related to HDFS-1149 [ HDFS-1149 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment h3181_20120426.patch [ 12524660 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment h3181_20120425.patch [ 12524381 ]
          Tsz Wo Nicholas Sze made changes -
          Priority Critical [ 2 ] Minor [ 4 ]
          Assignee Tsz Wo (Nicholas), SZE [ szetszwo ]
          Tsz Wo Nicholas Sze made changes -
          Summary testHardLeaseRecoveryAfterNameNodeRestart fails when length before restart is 1 byte less than block size testHardLeaseRecoveryAfterNameNodeRestart fails when length before restart is 1 byte less than CRC chunk size
          Description org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart seems to be failing intermittently on jenkins.

          {code}
          org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart
          Failing for the past 1 build (Since Failed#2163 )
          Took 8.4 sec.
          Error Message

          Lease mismatch on /hardLeaseRecovery owned by HDFS_NameNode but is accessed by DFSClient_NONMAPREDUCE_1147689755_1 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2076) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2051) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:1983) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:492) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:311) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42604) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:417) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:891) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1661) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1657) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1205) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1655)

          Stacktrace

          org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: Lease mismatch on /hardLeaseRecovery owned by HDFS_NameNode but is accessed by DFSClient_NONMAPREDUCE_1147689755_1
          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2076)
          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2051)
          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:1983)
          at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:492)
          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:311)
          at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42604)
          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:417)
          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:891)
          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1661)
          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1657)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:396)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1205)
          at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1655)

          at org.apache.hadoop.ipc.Client.call(Client.java:1159)
          at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:185)
          at $Proxy15.getAdditionalDatanode(Unknown Source)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
          at $Proxy15.getAdditionalDatanode(Unknown Source)
          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:317)
          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:828)
          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:930)
          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:741)
          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:416)
          {code}
          org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart seems to be failing intermittently on jenkins.

          {code}
          org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart
          Failing for the past 1 build (Since Failed#2163 )
          Took 8.4 sec.
          Error Message

          Lease mismatch on /hardLeaseRecovery owned by HDFS_NameNode but is accessed by DFSClient_NONMAPREDUCE_1147689755_1 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2076) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2051) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:1983) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:492) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:311) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42604) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:417) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:891) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1661) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1657) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1205) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1655)

          Stacktrace

          org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: Lease mismatch on /hardLeaseRecovery owned by HDFS_NameNode but is accessed by DFSClient_NONMAPREDUCE_1147689755_1
          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2076)
          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2051)
          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:1983)
          at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:492)
          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:311)
          at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42604)
          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:417)
                  ...
          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
          at $Proxy15.getAdditionalDatanode(Unknown Source)
          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:317)
          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:828)
          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:930)
          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:741)
          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:416)
          {code}
          Todd Lipcon made changes -
          Attachment repro.txt [ 12521105 ]
          Tsz Wo Nicholas Sze made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Tsz Wo Nicholas Sze made changes -
          Attachment TestLeaseRecovery2with1535.patch [ 12521094 ]
          Todd Lipcon made changes -
          Summary testHardLeaseRecoveryAfterNameNodeRestart fails intermittently testHardLeaseRecoveryAfterNameNodeRestart fails when length before restart is 1 byte less than block size
          Todd Lipcon made changes -
          Affects Version/s 2.0.0 [ 12320353 ]
          Priority Major [ 3 ] Critical [ 2 ]
          Colin Patrick McCabe made changes -
          Summary org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryAfterNameNodeRestart fails intermittently testHardLeaseRecoveryAfterNameNodeRestart fails intermittently
          Colin Patrick McCabe made changes -
          Field Original Value New Value
          Attachment testOut.txt [ 12521089 ]
          Colin Patrick McCabe created issue -

            People

            • Assignee:
              Tsz Wo Nicholas Sze
              Reporter:
              Colin Patrick McCabe
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development