Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-3664

BlockManager race when stopping active services

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: 2.0.2-alpha
    • Component/s: test
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Looks like there's a race hit by the test where we try to do block management while shutting down (eg allocate a block or compute replication work after BlocksMap#close, resulting in an NPE).

      1. HDFS-3664.001.patch
        3 kB
        Colin Patrick McCabe
      2. HDFS-3664.002.patch
        3 kB
        Colin Patrick McCabe

        Issue Links

          Activity

          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #1194 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1194/)
          HDFS-3664. BlockManager race when stopping active services. Contributed by Colin Patrick McCabe (Revision 1383753)

          Result = SUCCESS
          eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383753
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk #1194 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1194/ ) HDFS-3664 . BlockManager race when stopping active services. Contributed by Colin Patrick McCabe (Revision 1383753) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383753 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1163 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1163/)
          HDFS-3664. BlockManager race when stopping active services. Contributed by Colin Patrick McCabe (Revision 1383753)

          Result = FAILURE
          eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383753
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #1163 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1163/ ) HDFS-3664 . BlockManager race when stopping active services. Contributed by Colin Patrick McCabe (Revision 1383753) Result = FAILURE eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383753 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #2747 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2747/)
          HDFS-3664. BlockManager race when stopping active services. Contributed by Colin Patrick McCabe (Revision 1383753)

          Result = FAILURE
          eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383753
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #2747 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2747/ ) HDFS-3664 . BlockManager race when stopping active services. Contributed by Colin Patrick McCabe (Revision 1383753) Result = FAILURE eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383753 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #2786 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2786/)
          HDFS-3664. BlockManager race when stopping active services. Contributed by Colin Patrick McCabe (Revision 1383753)

          Result = SUCCESS
          eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383753
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #2786 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2786/ ) HDFS-3664 . BlockManager race when stopping active services. Contributed by Colin Patrick McCabe (Revision 1383753) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383753 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #2723 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2723/)
          HDFS-3664. BlockManager race when stopping active services. Contributed by Colin Patrick McCabe (Revision 1383753)

          Result = SUCCESS
          eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383753
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #2723 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2723/ ) HDFS-3664 . BlockManager race when stopping active services. Contributed by Colin Patrick McCabe (Revision 1383753) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383753 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
          Hide
          Eli Collins added a comment -

          I've committed this and merged to branch-2 and branch-2.0.2-alpha. Thanks Colin.

          Show
          Eli Collins added a comment - I've committed this and merged to branch-2 and branch-2.0.2-alpha. Thanks Colin.
          Hide
          Eli Collins added a comment -

          Good catch. +1 (test failure is still unrelated)

          Show
          Eli Collins added a comment - Good catch. +1 (test failure is still unrelated)
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12544746/HDFS-3664.002.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.hdfs.TestDatanodeBlockScanner

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3175//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3175//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12544746/HDFS-3664.002.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDatanodeBlockScanner +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3175//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3175//console This message is automatically generated.
          Hide
          Colin Patrick McCabe added a comment -

          Not your change but in BlockManager we should check datanodeManager and blocksMap for null like we do pendingReplications right?

          Hmm. All of those fields are declared final and unconditionally initialized to something non-null in the constructor, so I don't think there should be any null checks in the close function. I'll post a patch in a sec.

          Show
          Colin Patrick McCabe added a comment - Not your change but in BlockManager we should check datanodeManager and blocksMap for null like we do pendingReplications right? Hmm. All of those fields are declared final and unconditionally initialized to something non-null in the constructor, so I don't think there should be any null checks in the close function. I'll post a patch in a sec.
          Hide
          Eli Collins added a comment -

          Change looks good. And the test failures look unrelated, TestDatanodeBlockScanner is known and looks like its still running java process is locking the storage dir causing TestDatanodeBlockScanner to fail. I verified TestBlockMissingException with your patch for sanity.

          Not your change but in BlockManager we should check datanodeManager and blocksMap for null like we do pendingReplications right?

          Show
          Eli Collins added a comment - Change looks good. And the test failures look unrelated, TestDatanodeBlockScanner is known and looks like its still running java process is locking the storage dir causing TestDatanodeBlockScanner to fail. I verified TestBlockMissingException with your patch for sanity. Not your change but in BlockManager we should check datanodeManager and blocksMap for null like we do pendingReplications right?
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12544723/HDFS-3664.001.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.hdfs.TestBlockMissingException
          org.apache.hadoop.hdfs.TestDatanodeBlockScanner

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3173//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3173//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12544723/HDFS-3664.001.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestBlockMissingException org.apache.hadoop.hdfs.TestDatanodeBlockScanner +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3173//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3173//console This message is automatically generated.
          Hide
          Colin Patrick McCabe added a comment -
          • Previously, we were setting blocksMap to null before calling datanodeManager#close. However, DatanodeManager#decommissionThread accesses blocksMap at various times while the thread runs. Instead, wait until the decommissionThread has been stopped before closing blocksMap.
          • We also previously were not joining DatanodeManager#decommissionThread or DatanodeManager#heartbeatManager, so there was no guarantee those threads would have exited before blocksMap was nulled. Let's join these threads.
          Show
          Colin Patrick McCabe added a comment - Previously, we were setting blocksMap to null before calling datanodeManager#close . However, DatanodeManager#decommissionThread accesses blocksMap at various times while the thread runs. Instead, wait until the decommissionThread has been stopped before closing blocksMap. We also previously were not joining DatanodeManager#decommissionThread or DatanodeManager#heartbeatManager , so there was no guarantee those threads would have exited before blocksMap was nulled. Let's join these threads.
          Hide
          Eli Collins added a comment -

          Yea, this is a separate issue. Looks like we're racing between BM shutdown and an outstanding request.

          Show
          Eli Collins added a comment - Yea, this is a separate issue. Looks like we're racing between BM shutdown and an outstanding request.
          Hide
          Andy Isaacson added a comment -

          May be a dupe of HDFS-3048.

          We've merged that patch and are still seeing this NPE (for example HDFS-3822), so it's not a dupe.

          Show
          Andy Isaacson added a comment - May be a dupe of HDFS-3048 . We've merged that patch and are still seeing this NPE (for example HDFS-3822 ), so it's not a dupe.
          Hide
          Eli Collins added a comment -

          May be a dupe of HDFS-3048.

          Show
          Eli Collins added a comment - May be a dupe of HDFS-3048 .
          Hide
          Eli Collins added a comment -

          Same NPE in TestListCorruptFileBlocks.testListCorruptFileBlocksInSafeMode.

          2012-07-15 21:48:39,685 INFO namenode.FSNamesystem (FSNamesystem.java:stopActiveServices(673)) - Stopping services started for active state
          2012-07-15 21:48:39,686 FATAL blockmanagement.BlockManager (BlockManager.java:run(2969)) - ReplicationMonitor thread received Runtime exception.
          java.lang.NullPointerException
          at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.getBlockCollection(BlocksMap.java:98)
          at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1097)
          at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1072)
          at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3000)
          at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:2962)
          at java.lang.Thread.run(Thread.java:662)
          2012-07-15 21:48:39,685 WARN blockmanagement.DecommissionM

          Show
          Eli Collins added a comment - Same NPE in TestListCorruptFileBlocks.testListCorruptFileBlocksInSafeMode. 2012-07-15 21:48:39,685 INFO namenode.FSNamesystem (FSNamesystem.java:stopActiveServices(673)) - Stopping services started for active state 2012-07-15 21:48:39,686 FATAL blockmanagement.BlockManager (BlockManager.java:run(2969)) - ReplicationMonitor thread received Runtime exception. java.lang.NullPointerException at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.getBlockCollection(BlocksMap.java:98) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1097) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1072) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3000) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:2962) at java.lang.Thread.run(Thread.java:662) 2012-07-15 21:48:39,685 WARN blockmanagement.DecommissionM
          Hide
          Eli Collins added a comment -

          Here's the relevant portion of the log:

          2012-07-14 20:28:44,991 INFO namenode.FSNamesystem (FSNamesystem.java:stopActiveServices(673)) - Stopping services started for active state
          2012-07-14 20:28:44,991 WARN blockmanagement.DecommissionManager (DecommissionManager.java:run(78)) - Monitor interrupted: java.lang.InterruptedException: sleep interrupted
          2012-07-14 20:28:44,991 INFO namenode.FSNamesystem (FSNamesystem.java:stopStandbyServices(731)) - Stopping services started for standby state
          2012-07-14 20:28:44,992 WARN ipc.Server (Server.java:run(1706)) - IPC Server handler 2 on 58865, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 127.0.0.1:60713: error: java.lang.NullPointerException
          java.lang.NullPointerException
          at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.getBlockCollection(BlocksMap.java:98)
          at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.getBlockCollection(BlockManager.java:2893)
          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.isValidBlock(FSNamesystem.java:4437)
          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.allocateBlock(FSNamesystem.java:2409)
          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2170)
          at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:470)
          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:292)
          at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42602)
          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:452)
          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:396)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
          at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)

          Show
          Eli Collins added a comment - Here's the relevant portion of the log: 2012-07-14 20:28:44,991 INFO namenode.FSNamesystem (FSNamesystem.java:stopActiveServices(673)) - Stopping services started for active state 2012-07-14 20:28:44,991 WARN blockmanagement.DecommissionManager (DecommissionManager.java:run(78)) - Monitor interrupted: java.lang.InterruptedException: sleep interrupted 2012-07-14 20:28:44,991 INFO namenode.FSNamesystem (FSNamesystem.java:stopStandbyServices(731)) - Stopping services started for standby state 2012-07-14 20:28:44,992 WARN ipc.Server (Server.java:run(1706)) - IPC Server handler 2 on 58865, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 127.0.0.1:60713: error: java.lang.NullPointerException java.lang.NullPointerException at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.getBlockCollection(BlocksMap.java:98) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.getBlockCollection(BlockManager.java:2893) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.isValidBlock(FSNamesystem.java:4437) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.allocateBlock(FSNamesystem.java:2409) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2170) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:470) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:292) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42602) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:452) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)

            People

            • Assignee:
              Colin Patrick McCabe
              Reporter:
              Eli Collins
            • Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development