Hadoop HDFS / HDFS-4867

metaSave NPEs when there are invalid blocks in repl queue.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.7, 2.0.4-alpha, 0.23.8
    • Fix Version/s: 2.1.0-beta, 0.23.9
    • Component/s: namenode
    • Labels:
      None

      Description

      Since metaSave cannot get the inode holding an orphaned/invalid block, it NPEs and stops generating the rest of the report. Normally ReplicationMonitor removes such blocks quickly, but if the queue is huge, it can take a very long time. In safe mode, they stay indefinitely.
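
      Editor's illustration (not the committed patch): the fix discussed in the comments below amounts to guarding the block-to-file lookup and reporting the block as orphaned instead of dereferencing a null inode. A minimal self-contained sketch, with hypothetical `Block`/`dump` names standing in for the BlockManager logic:

      ```java
      import java.util.ArrayList;
      import java.util.Arrays;
      import java.util.List;

      // Editor's sketch, not the committed patch: a metaSave-style dump that
      // reports a block whose owning file is gone instead of throwing an NPE.
      // Block and dump are hypothetical stand-ins for the BlockManager logic.
      public class MetaSaveSketch {
          static class Block {
              final long id;
              final String owningFile; // null models an orphaned/invalid block
              Block(long id, String owningFile) {
                  this.id = id;
                  this.owningFile = owningFile;
              }
          }

          static List<String> dump(List<Block> replQueue) {
              List<String> report = new ArrayList<>();
              for (Block b : replQueue) {
                  if (b.owningFile == null) {
                      // The unguarded code dereferenced the missing inode here,
                      // NPE'd, and truncated the rest of the report.
                      report.add("Block blk_" + b.id + " is Null. [orphaned]");
                  } else {
                      report.add("Block blk_" + b.id + " belongs to " + b.owningFile);
                  }
              }
              return report;
          }

          public static void main(String[] args) {
              List<Block> queue = Arrays.asList(new Block(1, "/user/a"), new Block(2, null));
              dump(queue).forEach(System.out::println);
          }
      }
      ```

      The key point is that the dump keeps going past the bad entry, so one orphaned block no longer hides the rest of the report.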

      1. HDFS-4867.branch-0.23.patch
        3 kB
        Konstantin Shvachko
      2. HDFS-4867.branch-0.23.patch
        2 kB
        Plamen Jeliazkov
      3. HDFS-4867.branch-0.23.patch
        6 kB
        Ravi Prakash
      4. HDFS-4867.branch-0.23.patch
        5 kB
        Ravi Prakash
      5. HDFS-4867.branch2.patch
        6 kB
        Ravi Prakash
      6. HDFS-4867.branch2.patch
        4 kB
        Plamen Jeliazkov
      7. HDFS-4867.branch2.patch
        4 kB
        Plamen Jeliazkov
      8. HDFS-4867.branch-2.patch
        1.0 kB
        Plamen Jeliazkov
      9. HDFS-4867.trunk.patch
        2 kB
        Plamen Jeliazkov
      10. HDFS-4867.trunk.patch
        6 kB
        Ravi Prakash
      11. HDFS-4867.trunk.patch
        5 kB
        Plamen Jeliazkov
      12. HDFS-4867.trunk.patch
        4 kB
        Plamen Jeliazkov
      13. testMetaSave.log
        6 kB
        Konstantin Shvachko

        Issue Links

          Activity

          Plamen Jeliazkov added a comment -

          Kihwal, do you have any log snippets or stack traces by chance?

          Kihwal Lee added a comment -

          Sorry I forgot to post. This is from branch-0.23. branch-2/trunk uses block collection, but it may end up with the same NPE.

          java.io.IOException: java.lang.NullPointerException
                  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.metaSave(BlockManager.java:352)
                  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:614)
                  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.metaSave(NameNodeRpcServer.java:671)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                  at java.lang.reflect.Method.invoke(Method.java:601)
                  at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:394)
                  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1571)
                  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1567)
                  at java.security.AccessController.doPrivileged(Native Method)
                  at javax.security.auth.Subject.doAs(Subject.java:415)
                  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1282)
                  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1565)

          Plamen Jeliazkov added a comment -

          I think I am seeing the case in branch-2; did your error look something like this?

          2013-04-25 04:27:36,826 WARN org.apache.hadoop.ipc.Server: IPC Server handler 9 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.metaSave from 10.4.106.54:46567: error: java.lang.NullPointerException
          java.lang.NullPointerException
                  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.dumpBlockMeta(BlockManager.java:459)
                  at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.metaSave(BlockManager.java:419)
                  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1063)
                  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.metaSave(FSNamesystem.java:1048)
                  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.metaSave(NameNodeRpcServer.java:785)
                  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.metaSave(ClientNamenodeProtocolServerSideTranslatorPB.java:640)
                  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:40788)
                  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
                  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
                  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1735)
                  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1731)
                  at java.security.AccessController.doPrivileged(Native Method)
                  at javax.security.auth.Subject.doAs(Subject.java:396)
                  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
                  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1729)
          

          Also, are you able to successfully reproduce it by chance?

          Kihwal Lee added a comment -

          I think I am seeing the case in branch-2; did your error look something like this?

          Yes, that is the same bug.

          Also, are you able to successfully reproduce it by chance?

          I first saw it happening in safe mode and then during a massive decommissioning. In the former case, ReplicationMonitor is not processing the neededReplications queue, so these blocks are not thrown away. In the latter case, it does run but can't get to those blocks in time, since it limits the number of blocks it processes in one iteration.

          Detecting this condition is simple, but we need to think about what to do with it. Maybe it should throw the blocks away, like ReplicationMonitor would, if running in a non-startup safemode. Outside safemode, it could just report them, since ReplicationMonitor will eventually do the job.
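
          The per-iteration limit described above can be illustrated with a toy sketch (editor's illustration; `drainOnce` and `blocksPerIteration` are hypothetical names, not Hadoop APIs). With a bounded batch size, a huge queue drains over many monitor cycles, so stale entries can linger for a long time:

          ```java
          import java.util.ArrayDeque;
          import java.util.Deque;

          // Toy model of a monitor that processes at most blocksPerIteration
          // entries per cycle, so a large backlog drains slowly.
          public class BoundedQueueSketch {
              static int drainOnce(Deque<Long> neededReplications, int blocksPerIteration) {
                  int processed = 0;
                  while (processed < blocksPerIteration && !neededReplications.isEmpty()) {
                      neededReplications.poll(); // validate/schedule, or discard if orphaned
                      processed++;
                  }
                  return processed;
              }

              public static void main(String[] args) {
                  Deque<Long> queue = new ArrayDeque<>();
                  for (long i = 0; i < 10_000; i++) queue.add(i);
                  int iterations = 0;
                  while (!queue.isEmpty()) {
                      drainOnce(queue, 1000);
                      iterations++;
                  }
                  System.out.println(iterations); // prints 10
              }
          }
          ```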

          Todd Lipcon added a comment -

          It seems to me that metasave should just be a "read" operation, and not modify the queue even if it detects that it's an invalid block. I'd vote for just logging it with an "[orphaned]" or something like that.

          Plamen Jeliazkov added a comment -

          I agree with Todd. metaSave should not modify the queue – abandoned blocks should be taken care of by the BlockManager, specifically the computeDatanodeWork method in the ReplicationMonitor. I can write up a patch to log the block as abandoned and a test for this. Ravi, have you started work already? Otherwise I'd like to take this up.

          Plamen Jeliazkov added a comment -

          Attaching patch with unit test to print orphaned blocks from metaSave. This will fix the immediate issue but I struggle to understand WHY this is happening in the first place...

          I am able to simulate orphaned blocks in the unit test by deleting the created file immediately before metaSave is called.

          Plamen Jeliazkov added a comment -

          Ravi, I am going to take this issue up. If you would like to take it back please let me know and I will back off.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12585941/HDFS-4867.trunk.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4470//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4470//console

          This message is automatically generated.

          Ravi Prakash added a comment -

          Hi Plamen, Please feel free to take this up.

          Konstantin Shvachko added a comment -

          metaSave is probably a casualty here. Should we take a look at why orphaned / missing blocks are kept in replication queues in the first place?
          It seems that when we delete a file, its blocks could also be removed from the replication queue; what is the point of replicating them if they don't belong to any file?

          It still makes sense to have this case covered in metaSave().
          The patch looks good. Couple of nits:

          1. Could you remove the 3 unused imports in the test?
          2. It would also be good to close the BufferedReader at the end of both test cases.
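
          The second nit can be addressed with try-with-resources (editor's sketch, assuming Java 7+; `firstLine` and the file path are hypothetical, not part of the patch):

          ```java
          import java.io.BufferedReader;
          import java.io.FileReader;
          import java.io.IOException;

          // Sketch: try-with-resources closes the reader even if an assertion
          // or read throws, so no explicit reader.close() is needed.
          public class ReadMetaSaveOutput {
              static String firstLine(String path) throws IOException {
                  try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
                      return reader.readLine(); // reader is closed automatically here
                  }
              }
          }
          ```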
          Plamen Jeliazkov added a comment -

          New patch with Konstantin's comments taken up.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12586139/HDFS-4867.trunk.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4474//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4474//console

          This message is automatically generated.

          Ravi Prakash added a comment -

          Patch looks good to me. Thanks Plamen!

          metaSave is probably a casualty here. Should we take a look at why orphaned / missing blocks are kept in replication queues in the first place?
          It seems that when we delete a file blocks can also be removed from replication queue, because what is the point of replicating them if they don't belong to any files.

          +1 for Konstantin's suggestion. Plamen, could you please open another JIRA for it?

          Ravi Prakash added a comment -

          Plamen, could you please open another JIRA for it?

          Seems like Tao did already. Thanks Tao!

          Konstantin Shvachko added a comment -

          +1 for the patch.

          Plamen Jeliazkov added a comment -

          The patch for trunk is applicable to branch-2.

          Ravi Prakash added a comment -

          Ported Plamen's patch to 0.23

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12586364/HDFS-4867.branch-0.23.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4482//console

          This message is automatically generated.

          Plamen Jeliazkov added a comment -

          Sorry; turns out there were differences with trunk after all. Attaching patch for branch-2.

          Konstantin Shvachko added a comment -

          Cancelling patch to unconfuse Jenkins.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12586365/HDFS-4867.branch2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          -1 javac. The patch appears to cause the build to fail.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4483//console

          This message is automatically generated.

          Konstantin Shvachko added a comment -

          Guys, patches for branch-2 and branch-0.23 both fail on TestMetaSave.
          Could you please take a look.

          Plamen Jeliazkov added a comment -

          There are subtle differences in the metaSave logs that I missed between branch-2 and trunk. Fixed up my branch-2 patch.

          Ravi Prakash added a comment -

          Konstantin! I rechecked my patch and ran it several times. The test passes for me. Could you please clean your working directory and try again?

          Konstantin Shvachko added a comment -

          Including the log from a run on branch-0.23, where I just added a message in failing asserts, which prints the line it is seeing.
          Could be some timing issue?
          I did clean everything.

          Konstantin Shvachko added a comment -

          OK, I think it's the sequence in which the test cases are executed.
          Better to fix it, since test execution order differs between Java 6 and 7.

          Ravi Prakash added a comment -

          This may be because @BeforeClass initializes the cluster only once. In that case the patches for trunk and 2.0 will have to be updated too. Let me check.

          Ravi Prakash added a comment -

          The test fails irrespective of the order in which the cases are run, when run from a clean hadoop-hdfs-project/hadoop-hdfs. This is true for trunk as well as 0.23.

          Ravi Prakash added a comment -

          The problem was in orphaned metasave output files. I took the liberty of fixing and refactoring the tests a bit.
          Konstantin, could you please review and commit?

          Ravi Prakash added a comment -

          And a delicious patch for Hadoop QA to munch

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12586461/HDFS-4867.trunk.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4484//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4484//console

          This message is automatically generated.

          Konstantin Shvachko added a comment -

          Checked the patches. They still fail, sorry. I think the problem is that orphaned blocks are still present and reported in metasave. That is, if you run testMetaSaveWithOrphanedBlocks() first and then testMetaSave(), they fail, because the reports differ from what is expected.

          So I propose removing testMetaSaveWithOrphanedBlocks() from this patch, because HDFS-4878 will break it right away. Why don't we include this test in HDFS-4878 instead, modified for the output there? After that, the order of running test cases shouldn't matter, because there will be no orphaned blocks.

          Ravi Prakash added a comment -

          Ok

          So I propose to remove testMetaSaveWithOrphanedBlocks() for this patch.

          Sure! That's fine by me

          Plamen Jeliazkov added a comment -

          Attaching patches for all 3 branches that remove testMetaSaveWithOrphanedBlocks and remove unused imports.

          Ravi Prakash added a comment -

          Thanks Plamen! +1 All patches look good to me.

          Konstantin Shvachko added a comment -

          I took the liberty of reordering the imports for 0.23 to bring it in sync with branch-2. That may save us time in the future.

          Hudson added a comment -

          Integrated in Hadoop-trunk-Commit #3875 (See https://builds.apache.org/job/Hadoop-trunk-Commit/3875/)
          HDFS-4867. metaSave NPEs when there are invalid blocks in repl queue. Contributed by Plamen Jeliazkov and Ravi Prakash. (Revision 1490433)

          Result = SUCCESS
          shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1490433
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestMetaSave.java
          Konstantin Shvachko added a comment -

          I just committed this. Thank you Plamen and Ravi.

          Kihwal Lee added a comment -

          Konstantin Shvachko: It looks like the change was added to the CHANGES.TXT in mapreduce, not hdfs in branch-0.23.

          Kihwal Lee added a comment -

          It looks like the change was added to the CHANGES.TXT in mapreduce, not hdfs in branch-0.23.

          Fixed it.

          Konstantin Shvachko added a comment -

          Oops, sorry.


            People

            • Assignee: Plamen Jeliazkov
            • Reporter: Kihwal Lee
            • Votes: 0
            • Watchers: 9
