Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-1072

AlreadyBeingCreatedException with HDFS_NameNode as the lease holder

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.0
    • Component/s: hdfs-client, namenode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      TestReadWhileWriting may fail by AlreadyBeingCreatedException with HDFS_NameNode as the lease holder, which indicates that lease recovery is in an inconsistent state.

      1. HDFS-1072.branch-0.22.patch
        3 kB
        Erik Steffl
      2. HDFS-1072.branch-0.22-1.patch
        3 kB
        Erik Steffl

        Issue Links

          Activity

          Hide
          Tsz Wo Nicholas Sze added a comment -

          Created HDFS-1095.

          Show
          Tsz Wo Nicholas Sze added a comment - Created HDFS-1095 .
          Hide
          Konstantin Shvachko added a comment -

          I just committed this. Thank you Erik.

          Show
          Konstantin Shvachko added a comment - I just committed this. Thank you Erik.
          Hide
          Konstantin Shvachko added a comment -

          Agreed, such a test would be useful.

          Show
          Konstantin Shvachko added a comment - Agreed, such a test would be useful.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          +1 patch looks good to me.

          Future work:
          We should create a test as following:

          1. create a file, write some bytes but not close it.
          2. let hard lease expire
          3. append to the file right after NN lease recovery starts but before it ends; keep retrying.
            • Expecting a RecoveryInProgressException in the first few append calls and succeed as soon as the lease recovery is done.

          Does it make sense?

          Show
          Tsz Wo Nicholas Sze added a comment - +1 patch looks good to me. Future work: We should create a test as following: create a file, write some bytes but not close it. let hard lease expire append to the file right after NN lease recovery starts but before it ends; keep retrying. Expecting a RecoveryInProgressException in the first few append calls and succeed as soon as the lease recovery is done. Does it make sense?
          Hide
          Konstantin Shvachko added a comment -

          +1. Looks good.

          Show
          Konstantin Shvachko added a comment - +1. Looks good.
          Hide
          Erik Steffl added a comment -

          Failure in HadoopQa message (08/Apr/10 11:57 PM) is not related to patch, the error is, http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/307/console points to TestHdfsProxy problem: "Test org.apache.hadoop.hdfsproxy.TestHdfsProxy FAILED"

          Show
          Erik Steffl added a comment - Failure in HadoopQa message (08/Apr/10 11:57 PM) is not related to patch, the error is, http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/307/console points to TestHdfsProxy problem: "Test org.apache.hadoop.hdfsproxy.TestHdfsProxy FAILED"
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12441098/HDFS-1072.branch-0.22-1.patch
          against trunk revision 932036.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/307/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/307/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/307/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/307/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12441098/HDFS-1072.branch-0.22-1.patch against trunk revision 932036. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/307/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/307/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/307/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/307/console This message is automatically generated.
          Hide
          Erik Steffl added a comment -

          Patch HDFS-1072.branch-0.22-1.patch fixes the formatting as well as wording of the RecoveryInProgressException message.

          test-patch output:

          [exec] There appear to be 108 release audit warnings before the patch and 108 release audit warnings after applying the patch.
          [exec]
          [exec]
          [exec]
          [exec]
          [exec] +1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 3 new or modified tests.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
          [exec]
          [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
          [exec]

          Show
          Erik Steffl added a comment - Patch HDFS-1072 .branch-0.22-1.patch fixes the formatting as well as wording of the RecoveryInProgressException message. test-patch output: [exec] There appear to be 108 release audit warnings before the patch and 108 release audit warnings after applying the patch. [exec] [exec] [exec] [exec] [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec]
          Hide
          Konstantin Shvachko added a comment -

          This is the right approach. A couple of nits:

          1. Long lines both in FSNamesystem and the test - should be 80.
          2. The message for RecoveryInProgressException in addition to the holder should include file name (src), but should not guess "(should be namenode)".
          Show
          Konstantin Shvachko added a comment - This is the right approach. A couple of nits: Long lines both in FSNamesystem and the test - should be 80. The message for RecoveryInProgressException in addition to the holder should include file name (src), but should not guess "(should be namenode)".
          Hide
          Erik Steffl added a comment -

          Further investigation revealed that the following sequence leads to AlreadyBeingCreatedException:

          • LEASE_LIMIT=500; cluster.setLeasePeriod(LEASE_LIMIT, LEASE_LIMIT);
          • thread A gets a lease on a file
          • thread B sleeps 2*soft limit
          • thread B tries to get lease on a file, triggers lease recovery and gets RecoveryInProgressException
          • before lease recovery ends, namenode LeaseManager.java:checkLeases finds out that hard limit was also expired, start a new recovery, resets timeouts
          • thread B tries to get lease again, timeout is not expired (it was reset in previous step) so it gets AlreadyBeingCreatedException

          There are two problems in the code that lead to this:

          • hard limit should not be set to such a low value, it makes it very likely for recovery to not finish before it's taken over by another recovery (because of expired hard limit)
          • namenode should recognize that even though limit is not expired the recovery is ongoing and return RecoveryInProgressException instead of AlreadyBeingCreatedException (in FSNamesystem.java:startFileInternal, when it's deciding what to do if the file is under construction)
          Show
          Erik Steffl added a comment - Further investigation revealed that the following sequence leads to AlreadyBeingCreatedException: LEASE_LIMIT=500; cluster.setLeasePeriod(LEASE_LIMIT, LEASE_LIMIT); thread A gets a lease on a file thread B sleeps 2*soft limit thread B tries to get lease on a file, triggers lease recovery and gets RecoveryInProgressException before lease recovery ends, namenode LeaseManager.java:checkLeases finds out that hard limit was also expired, start a new recovery, resets timeouts thread B tries to get lease again, timeout is not expired (it was reset in previous step) so it gets AlreadyBeingCreatedException There are two problems in the code that lead to this: hard limit should not be set to such a low value, it makes it very likely for recovery to not finish before it's taken over by another recovery (because of expired hard limit) namenode should recognize that even though limit is not expired the recovery is ongoing and return RecoveryInProgressException instead of AlreadyBeingCreatedException (in FSNamesystem.java:startFileInternal, when it's deciding what to do if the file is under construction)
          Hide
          Konstantin Shvachko added a comment -

          I see it happening in 0.21 as well.

          Show
          Konstantin Shvachko added a comment - I see it happening in 0.21 as well.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          > ... Namenode should never throw AlreadyBeingCreatedException with HDFS_NameNode as the lease holder.

          • If the lease is being recovered, then RecoveryInProgressException should be thrown.
          • If the file is being created, then it should throw AlreadyBeingCreatedException with DFSClient as the lease holder.
          Show
          Tsz Wo Nicholas Sze added a comment - > ... Namenode should never throw AlreadyBeingCreatedException with HDFS_NameNode as the lease holder. If the lease is being recovered, then RecoveryInProgressException should be thrown. If the file is being created, then it should throw AlreadyBeingCreatedException with DFSClient as the lease holder.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          I was thinking that this is just a test problem. However, it seems not the case since Namenode should never throw AlreadyBeingCreatedException with HDFS_NameNode as the lease holder.

          Show
          Tsz Wo Nicholas Sze added a comment - I was thinking that this is just a test problem. However, it seems not the case since Namenode should never throw AlreadyBeingCreatedException with HDFS_NameNode as the lease holder.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          Here is the stack trace.

          org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException:
           failed to create file /TestReadWhileWriting/file1 for DFSClient_-1436751315 on client 127.0.0.1,
           because this file is already being created by HDFS_NameNode on 127.0.0.1
          	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1210)
          	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1305)
          	at org.apache.hadoop.hdfs.server.namenode.NameNode.append(NameNode.java:636)
          	at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:342)
          	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1253)
          	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1249)
          	at java.security.AccessController.doPrivileged(Native Method)
          	at javax.security.auth.Subject.doAs(Subject.java:396)
          	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:706)
          	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1247)
          
          	at org.apache.hadoop.ipc.Client.call(Client.java:895)
          	at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198)
          	at $Proxy6.append(Unknown Source)
          	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
          	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
          	at $Proxy6.append(Unknown Source)
          	at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:701)
          	at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:212)
          	at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:793)
          	at org.apache.hadoop.hdfs.TestReadWhileWriting.append(TestReadWhileWriting.java:111)
          	at org.apache.hadoop.hdfs.TestReadWhileWriting.pipeline_02_03(TestReadWhileWriting.java:95)
          
          Show
          Tsz Wo Nicholas Sze added a comment - Here is the stack trace. org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /TestReadWhileWriting/file1 for DFSClient_-1436751315 on client 127.0.0.1, because this file is already being created by HDFS_NameNode on 127.0.0.1 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1210) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1305) at org.apache.hadoop.hdfs.server.namenode.NameNode.append(NameNode.java:636) at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:342) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1253) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1249) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:706) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1247) at org.apache.hadoop.ipc.Client.call(Client.java:895) at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198) at $Proxy6.append(Unknown Source) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) at $Proxy6.append(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:701) at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:212) at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:793) at org.apache.hadoop.hdfs.TestReadWhileWriting.append(TestReadWhileWriting.java:111) at org.apache.hadoop.hdfs.TestReadWhileWriting.pipeline_02_03(TestReadWhileWriting.java:95)

            People

            • Assignee:
              Erik Steffl
              Reporter:
              Tsz Wo Nicholas Sze
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development