Hadoop HDFS
HDFS-630

In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block.

    Details

    • Hadoop Flags:
      Incompatible change, Reviewed

      Description

Created from HDFS-200.

If during a write, the dfsclient sees that a block replica location for a newly allocated block is not connectable, it re-requests the NN to get a fresh set of replica locations for the block. It tries this dfs.client.block.write.retries times (default 3), sleeping 6 seconds between each retry (see DFSClient.nextBlockOutputStream).

This setting works well when you have a reasonably sized cluster; if you have few datanodes in the cluster, every retry may pick the dead datanode, and the above logic bails out.

Our solution: when getting block locations from the namenode, we give the NN the excluded datanodes. The list of dead datanodes is only for one block allocation.
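The retry-with-exclusion flow described above can be sketched as a small, self-contained simulation (plain Java, not HDFS code; the node names and the simulated locateFollowingBlock are illustrative assumptions):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the client-side behavior: each failed connection adds the bad
// datanode to an exclude list that is sent back to the (here: simulated)
// namenode on the next allocation request.
public class ExcludeRetrySketch {
    static final int MAX_RETRIES = 3; // mirrors dfs.client.block.write.retries

    // Simulated namenode: returns the first live node not on the exclude list.
    static String locateFollowingBlock(List<String> nodes, List<String> excluded) {
        for (String node : nodes) {
            if (!excluded.contains(node)) return node;
        }
        throw new IllegalStateException("no usable datanode");
    }

    static String allocateBlock(List<String> nodes, List<String> dead) {
        List<String> excluded = new ArrayList<>();
        for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
            String node = locateFollowingBlock(nodes, excluded);
            if (!dead.contains(node)) return node; // connection succeeded
            excluded.add(node);                    // exclude it and re-request
        }
        throw new IllegalStateException("out of retries");
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("dn1", "dn2", "dn3", "dn4");
        // dn1 is dead but not yet declared dead by the namenode.
        String chosen = allocateBlock(nodes, Arrays.asList("dn1"));
        System.out.println(chosen); // dn2
    }
}
```

Without the exclude list, a small cluster can hand back the same dead node on every retry, which is exactly the failure mode the description reports.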

      1. hdfs-630-0.20-append.patch
        13 kB
        Nicolas Spiegelberg
      2. hdfs-630-0.20.txt
        14 kB
        Todd Lipcon
      3. HDFS-630.patch
        9 kB
        Cosmin Lehene
      4. HDFS-630.20-security.1.patch
        14 kB
        Jitendra Nath Pandey
      5. 0001-Fix-HDFS-630-trunk-svn-4.patch
        17 kB
        Cosmin Lehene
      6. 0001-Fix-HDFS-630-trunk-svn-3.patch
        17 kB
        Cosmin Lehene
      7. 0001-Fix-HDFS-630-trunk-svn-3.patch
        17 kB
        stack
      8. 0001-Fix-HDFS-630-trunk-svn-2.patch
        15 kB
        Cosmin Lehene
      9. 0001-Fix-HDFS-630-trunk-svn-1.patch
        15 kB
        Cosmin Lehene
      10. 0001-Fix-HDFS-630-svn.patch
        15 kB
        Cosmin Lehene
      11. 0001-Fix-HDFS-630-svn.patch
        15 kB
        Cosmin Lehene
      12. 0001-Fix-HDFS-630-for-0.21-and-trunk-unified.patch
        18 kB
        Cosmin Lehene
      13. 0001-Fix-HDFS-630-for-0.21.patch
        16 kB
        Cosmin Lehene
      14. 0001-Fix-HDFS-630-0.21-svn-2.patch
        17 kB
        Cosmin Lehene
      15. 0001-Fix-HDFS-630-0.21-svn-1.patch
        17 kB
        Cosmin Lehene
      16. 0001-Fix-HDFS-630-0.21-svn.patch
        17 kB
        Cosmin Lehene

        Issue Links

          Activity

Ruyue Ma added a comment - 20/Jul/09 11:32 PM
          to: dhruba borthakur

> This is not related to HDFS-4379. Let me explain why.
> The problem is actually related to HDFS-xxx. The namenode waits for 10 minutes after losing heartbeats from a datanode to declare it dead. During these 10 minutes, the NN is free to choose the dead datanode as a possible replica for a newly allocated block.

> If during a write, the dfsclient sees that a block replica location for a newly allocated block is not connectable, it re-requests the NN to get a fresh set of replica locations for the block. It tries this dfs.client.block.write.retries times (default 3), sleeping 6 seconds between each retry (see DFSClient.nextBlockOutputStream).
> This setting works well when you have a reasonable size cluster; if you have only 4 datanodes in the cluster, every retry picks the dead datanode and the above logic bails out.

> One solution is to change the value of dfs.client.block.write.retries to a much, much larger value, say 200 or so. Better still, increase the number of nodes in your cluster.

Our modification: when getting block locations from the namenode, we give the NN the excluded datanodes. The list of dead datanodes is only for one block allocation.

          +++ hadoop-new/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java 2009-07-20 00:19:03.000000000 +0800
          @@ -2734,6 +2734,7 @@
             LocatedBlock lb = null;
             boolean retry = false;
             DatanodeInfo[] nodes;
          +  DatanodeInfo[] exludedNodes = null;
             int count = conf.getInt("dfs.client.block.write.retries", 3);
             boolean success;
             do {
          @@ -2745,7 +2746,7 @@
             success = false;

             long startTime = System.currentTimeMillis();
          -  lb = locateFollowingBlock(startTime);
          +  lb = locateFollowingBlock(startTime, exludedNodes);
             block = lb.getBlock();
             nodes = lb.getLocations();

          @@ -2755,6 +2756,19 @@
             success = createBlockOutputStream(nodes, clientName, false);

             if (!success) {
          +
          +    LOG.info("Excluding node: " + nodes[errorIndex]);
          +    // Mark datanode as excluded
          +    DatanodeInfo errorNode = nodes[errorIndex];
          +    if (exludedNodes != null) {
          +      DatanodeInfo[] newExcludedNodes = new DatanodeInfo[exludedNodes.length + 1];
          +      System.arraycopy(exludedNodes, 0, newExcludedNodes, 0, exludedNodes.length);
          +      newExcludedNodes[exludedNodes.length] = errorNode;
          +      exludedNodes = newExcludedNodes;
          +    } else {
          +      exludedNodes = new DatanodeInfo[] { errorNode };
          +    }
          +
               LOG.info("Abandoning block " + block);
               namenode.abandonBlock(block, src, clientName);
          dhruba borthakur added a comment - 22/Jul/09 07:14 AM
Hi Ruyue, your option of excluding specific datanodes (specified by the client) sounds reasonable. This might help in the case of network partitioning, where a specific client loses access to a set of datanodes while those datanodes are alive and well and able to send heartbeats to the namenode. Can you please create a separate JIRA for your proposed fix and attach your patch there? Thanks.
          Cosmin Lehene added a comment -

          Patch for 0.20 branch.

Added
  public LocatedBlock addBlock(String src, String clientName, DatanodeInfo[] excludedNodes) throws IOException;
to ClientProtocol and implemented the method in both DFSClient and NameNode.

Added the method to FSNamesystem too.

DFSClient will keep track of nodes that time out when creating a new block and pass that list when retrying.

NameNode will pass the excludedNodes list to FSNamesystem, and so on.

Fixed src/test/org/apache/hadoop/hdfs/TestDFSClientRetries.java to reflect the changes in DFSClient.

Kept the old interface as well on the server side.

          We've tested on a cluster with HBase on top and it worked fine.
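The backward-compatible overload pattern this comment describes can be illustrated with a minimal sketch (simplified String signatures are an assumption; the real ClientProtocol method takes and returns HDFS types such as LocatedBlock and DatanodeInfo[]):

```java
// Sketch: the new method accepts a client-supplied exclude list, while the
// old method is kept on the server side and delegates with no exclusions,
// so unpatched clients keep working.
public class AddBlockOverloadSketch {
    // New method: honors the exclude list when "allocating" a block.
    static String addBlock(String src, String clientName, String[] excludedNodes) {
        int excluded = (excludedNodes == null) ? 0 : excludedNodes.length;
        return src + ": allocated, excluding " + excluded + " node(s)";
    }

    // Old method, preserved for compatibility; equivalent to passing null.
    static String addBlock(String src, String clientName) {
        return addBlock(src, clientName, null);
    }

    public static void main(String[] args) {
        System.out.println(addBlock("/f", "c"));                        // old client path
        System.out.println(addBlock("/f", "c", new String[] {"dn1"})); // patched client path
    }
}
```

Keeping the old signature delegating to the new one is what lets patched and unpatched DFSClients coexist against the same NameNode.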

          dhruba borthakur added a comment -

The code looks good. But as you may know, only regression fixes go into pre-existing release branches. We can target this fix for trunk. If you can merge this patch with hadoop trunk and resubmit your patch, that would be great. Also, most patch submissions require an associated junit test. You can find many existing junit tests in the src/test directory of the svn repository. Thanks.

          Cosmin Lehene added a comment -

          I'll try to submit the patch for trunk including unit tests. This fix is important to have HBase running correctly in case of datanode failures (http://issues.apache.org/jira/browse/HBASE-1876) so we'll probably have to maintain the patch for 0.20.x as well.

          Cosmin Lehene added a comment -

          Adapted for 0.21 branch.

Added excludedNodes back to BlockPlacementPolicy.
Adapted to use HashMap<Node, Node> instead of List<Node>, since BlockPlacementPolicyDefault was changed to use a HashMap. However, I'm not sure whether it's supposed to be a HashMap...
Luckily, Dhruba didn't remove the code that dealt with excludedNodes from BlockPlacementPolicyDefault, so I only had to wire up the methods.

          I also added a "unit" test - it's practically a functional test that spins up a DFSMiniCluster with 3 DataNodes and kills one before creating the file.
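The HashMap<Node, Node> mentioned above is effectively used as a set: a node is excluded by inserting it as both key and value, and placement skips any candidate present as a key. A minimal sketch (simplified with Strings as an assumption; the real code uses org.apache.hadoop.net.Node):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;

// Sketch of the map-used-as-a-set exclusion pattern in block placement.
public class ExcludedNodesMapSketch {
    // Returns the first candidate that is not excluded, or null if none.
    static String chooseTarget(List<String> candidates,
                               HashMap<String, String> excludedNodes) {
        for (String node : candidates) {
            if (!excludedNodes.containsKey(node)) {
                return node;
            }
        }
        return null; // no usable node
    }

    public static void main(String[] args) {
        HashMap<String, String> excluded = new HashMap<>();
        excluded.put("dn1", "dn1"); // exclude the dead datanode
        String target = chooseTarget(Arrays.asList("dn1", "dn2", "dn3"), excluded);
        System.out.println(target); // dn2
    }
}
```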

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12422242/0001-Fix-HDFS-630-for-0.21.patch
          against trunk revision 825689.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/69/console

          This message is automatically generated.

          dhruba borthakur added a comment -

Hi there, the patch failed to apply to current trunk. Can you please merge the patch with the latest trunk and resubmit? Thanks a bunch.

          stack added a comment -

Cosmin: I applied your patch, but it seems to bring on an issue where I get "java.io.IOException: Cannot complete block: block has not been COMMITTED by the client" when closing a log file. See the hdfs-user mailing list; grep for the message subject "Cannot complete block: block has not been COMMITTED by the client". Do you see this? Thanks.

          stack added a comment -

          Yeah, retried on branch-21 and the addition of HDFS-630 brings on the above COMMITTED issue. Tried the patch for 0.20 and that doesn't have this issue.

          Cosmin Lehene added a comment -

          stack: I can't reproduce it on 0.21. I did find it in the NN log before upgrading the HBase jar to the patched hdfs.

          java.io.IOException: Cannot complete block: block has not been COMMITTED by the client
          at org.apache.hadoop.hdfs.server.namenode.BlockInfoUnderConstruction.convertToCompleteBlock(BlockInfoUnderConstruction.java:158)
          at org.apache.hadoop.hdfs.server.namenode.BlockManager.completeBlock(BlockManager.java:288)
          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1243)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:637)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:621)
          at sun.reflect.GeneratedMethodAccessor48.invoke(Unknown Source)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:516)
          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:964)
          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:960)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:396)
          at org.apache.hadoop.ipc.Server$Handler.run(Server.java:958)

I should point out that
  at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:621)
line 621 in the NameNode means it was called from an unpatched DFSClient that calls the old NameNode interface.
line 621: return addBlock(src, clientName, null, null);

This is part of public LocatedBlock addBlock(String src, String clientName, Block previous):

@Override
public LocatedBlock addBlock(String src, String clientName, Block previous)
    throws IOException {
  return addBlock(src, clientName, null, null);
}

This is different from your stack trace http://pastie.org/695936, which calls the complete() method.

However, could you search for the same error while adding a new block with addBlock() (like mine)? If you find it, you could figure out the entry point in NameNode, and if it's line 621 you might have an unpatched DFSClient.

However, even with an unpatched DFSClient I still fail to figure out why it would cause this. Perhaps I should get a better understanding of the cause of the exception. So far, from the code comments in BlockInfoUnderConstruction I have that
"the state of the block (the generation stamp and the length) has not been committed by the client or it does not have at least a minimal number of replicas reported from data-nodes."

          stack added a comment -

Cosmin: You are right. It was mismatched hadoop-hdfs jars on my end causing the problem. I don't see it anymore after ensuring all jars around the cluster are patched to the latest. Sorry for wasting your time.

          stack added a comment -

I've been running loadings against a small hbase cluster of 4 nodes – a usual hbase initial setup – with and without this patch on the hadoop 0.21 branch. With this patch in place, the loading completes even though I kill a regionserver and a DN. Without it, the loading fails because more than one regionserver dies complaining that it can't allocate a block to write a flush file (the NN keeps giving it the dead DN as the home for the new block and never moves on).

          +1 on this patch.

          Cosmin, mind making a version for TRUNK as per Dhruba's suggestion? Thanks.

          Cosmin Lehene added a comment -

The patch applies on trunk as well. However, since it's a git patch, I guess it caused some confusion. Here is the unified patch.

          stack added a comment -

          Trying against hudson.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12424983/0001-Fix-HDFS-630-for-0.21-and-trunk-unified.patch
          against trunk revision 835958.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/112/console

          This message is automatically generated.

          stack added a comment -

          @Cosmin: Make a non-git patch?

          dhruba borthakur added a comment -

          It would be nice to have a patch for trunk and a unit test.

          Cosmin Lehene added a comment -

I've
  patch -p1 < 0001-Fix-HDFS-630-for-0.21-and-trunk-unified.patch
  svn add src/test/hdfs/org/apache/hadoop/hdfs/TestDFSClientExcludedNodes.java
  svn diff > 0001-Fix-HDFS-630-svn.patch

          I really hope this works. It appears there's no easy way to generate a patch from git and have it working in this setup.

          Dhruba: if it still won't work, please run the patch with -p1 and then generate a patch that will work.
          By the way, a unit test is included with the last 3 patches.
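For what it's worth, git can also emit a p0-style diff directly, which sidesteps the a/ b/ prefix problem (this workflow is an assumption, not something from the thread):

```shell
# Produce a diff without the a/ b/ path prefixes so it applies like an svn diff:
git diff --no-prefix > 0001-Fix-HDFS-630-svn.patch

# Apply from the source root without stripping any path components:
patch -p0 < 0001-Fix-HDFS-630-svn.patch
```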

          stack added a comment -

          Submitting new patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12425088/0001-Fix-HDFS-630-svn.patch
          against trunk revision 880630.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 7 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 21 javac compiler warnings (more than the trunk's current 20 warnings).

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/114/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/114/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/114/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/114/console

          This message is automatically generated.

          Cosmin Lehene added a comment -

          Fixed the old method in NameNode.addBlock: it returned addBlock(src, clientName, null, null) instead of addBlock(src, clientName, previous, null), so when called it never committed the previous block.

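          The delegation bug described above can be sketched in isolation. This is a hypothetical, simplified model: the class name AddBlockSketch, the String-typed parameters, and the lastCommitted field are illustrative stand-ins, not the real NameNode, Block, or DatanodeInfo types.

```java
// Hypothetical, simplified sketch of the NameNode.addBlock overload bug fixed
// above; real HDFS types (Block, DatanodeInfo, LocatedBlock) are replaced by
// Strings for illustration.
class AddBlockSketch {
    static String lastCommitted = null;   // records the last committed block

    // New-style addBlock: commits the previous block (if any), then allocates.
    static String addBlock(String src, String clientName,
                           String previous, String[] excludedNodes) {
        if (previous != null) {
            lastCommitted = previous;     // commit the previous block
        }
        return "block-for-" + src;        // allocate the next block
    }

    // Buggy legacy overload: dropped 'previous', so it was never committed.
    static String addBlockBuggy(String src, String clientName, String previous) {
        return addBlock(src, clientName, null, null);
    }

    // Fixed legacy overload: forwards 'previous', as the comment describes.
    static String addBlockFixed(String src, String clientName, String previous) {
        return addBlock(src, clientName, previous, null);
    }

    public static void main(String[] args) {
        addBlockBuggy("/f", "c", "blk_1");
        if (lastCommitted != null) throw new AssertionError("unexpected commit");
        addBlockFixed("/f", "c", "blk_1");
        if (!"blk_1".equals(lastCommitted)) throw new AssertionError("not committed");
        System.out.println("OK");
    }
}
```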
          stack added a comment -

          Cancelling so Cosmin can resubmit

          Cosmin Lehene added a comment -

          Fix for 0.21 and trunk.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12425270/0001-Fix-HDFS-630-svn.patch
          against trunk revision 881531.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 7 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 21 javac compiler warnings (more than the trunk's current 20 warnings).

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/117/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/117/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/117/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/117/console

          This message is automatically generated.

          Cosmin Lehene added a comment -

          The last patch doesn't apply on trunk after the commit for HDFS-764. Here's a new patch for trunk that also fixes the previous javac warning.

          Cosmin Lehene added a comment -

          Submitting latest patch (for trunk only).

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12425328/0001-Fix-HDFS-630-trunk-svn-1.patch
          against trunk revision 881695.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 7 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/118/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/118/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/118/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/118/console

          This message is automatically generated.

          stack added a comment -

          I was going to commit this in a day or so unless there are objections. (The formatting is a little odd at times in this patch, but Cosmin seems to be doing his best to follow the formatting already in place in the files he's patching, at least for the few I checked.)

          Cosmin Lehene added a comment -

          I reformatted the code a little, trying to stay close to the files it changes. There's no consistent style across files, however.

          stack added a comment -

          New version looks fine. Retrying against Hudson to be sure.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12425610/0001-Fix-HDFS-630-trunk-svn-2.patch
          against trunk revision 881695.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 7 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/120/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/120/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/120/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/120/console

          This message is automatically generated.

          Cosmin Lehene added a comment -

          Can't see that build issue locally and can't figure out what caused it on the build server. Trying one more time.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12425610/0001-Fix-HDFS-630-trunk-svn-2.patch
          against trunk revision 882733.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 7 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/122/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/122/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/122/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/122/console

          This message is automatically generated.

          stack added a comment -

          Committed to TRUNK. Assigned to Cosmin. He did the work. Thanks for the patch Cosmin.

          Konstantin Boudnik added a comment -

          I missed this JIRA while it was being worked on, but am going to comment anyway. The comment is about the newly added test, which is written for JUnit v3:

          +public class TestDFSClientExcludedNodes extends TestCase {
          

          I'd like to ask all reviewers to pay attention to the fact that new tests are supposed to be written for JUnit v4.
          Here's a short instruction on how it should be done.

          Also, the commit message has the wrong JIRA number in it: it says HBASE-630 instead of HDFS-630.

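          The difference Konstantin points out can be sketched as follows. This is an illustrative contrast only: a stand-in @Test annotation is declared locally so the example runs without the JUnit jar, whereas real JUnit 4 tests import org.junit.Test; the method name excludedNodeIsNotReused is hypothetical.

```java
// Sketch contrasting JUnit 3's extends-TestCase style with JUnit 4's
// annotation-driven style. The @Test annotation here is a local stand-in for
// org.junit.Test so the example is self-contained.
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;

class JUnit4StyleSketch {
    @Retention(RetentionPolicy.RUNTIME)
    @interface Test {}                    // stand-in for org.junit.Test

    // JUnit 4 style: no TestCase superclass; the annotation marks the test.
    @Test
    public void excludedNodeIsNotReused() {
        if (2 + 2 != 4) throw new AssertionError("arithmetic broke");
    }

    public static void main(String[] args) throws Exception {
        int run = 0;
        for (Method m : JUnit4StyleSketch.class.getDeclaredMethods()) {
            if (m.isAnnotationPresent(Test.class)) {  // runner finds @Test methods
                m.invoke(new JUnit4StyleSketch());
                run++;
            }
        }
        System.out.println(run + " test(s) run");
    }
}
```

          In JUnit 3, the runner instead relied on extending TestCase and on method names starting with "test"; annotations remove both constraints.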
          stack added a comment -

          Thanks Konstantin. I noticed the incorrect commit message and looked into fixing it, but it seems like I need to talk to an svn admin, so I just let it slide (in CHANGES it has the correct message). Would you suggest opening a new issue to change the test from JUnit 3 to JUnit 4?

          Tsz Wo Nicholas Sze added a comment -

          The idea sounds good. Some comments on the patch:

          • Need to update ClientProtocol.versionID since the protocol is changed.
          • DFSClient should not print LOG.info messages. Otherwise, the log messages will be printed on the shell commands like "fs -put".
          • It is better to remove the old ClientProtocol.addBlock(..) in order to keep ClientProtocol simple. Also, we should update the javadoc.
          Cosmin Lehene added a comment -

          new patch for 0.21

          removed previous addBlock method
          changed ClientProtocol version
          changed log level in DFSClient to debug for the node exclusion operation
          refactored TestDFSClientExcludedNodes to junit4

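          The client-side behavior these patches implement can be sketched roughly as below. This is a hedged illustration, not the actual DFSClient code: the names nextBlockTargets and Namenode, and the String-based datanode identifiers, are stand-ins for DFSClient.nextBlockOutputStream and the ClientProtocol.addBlock call that now carries an excluded-node list.

```java
// Illustrative sketch of the retry loop: datanodes that fail for the current
// block are accumulated and passed back to the namenode so it picks fresh
// targets. The exclusion list is scoped to a single block allocation.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ExcludedNodesSketch {
    interface Namenode {
        // stand-in for ClientProtocol.addBlock(src, client, previous, excluded)
        String[] addBlock(String src, List<String> excluded);
    }

    static String[] nextBlockTargets(Namenode nn, String src,
                                     String deadNode, int retries) {
        List<String> excluded = new ArrayList<>();  // per-block exclusion list
        for (int i = 0; i < retries; i++) {
            String[] targets = nn.addBlock(src, excluded);
            if (!Arrays.asList(targets).contains(deadNode)) {
                return targets;                     // all targets connectable
            }
            excluded.add(deadNode);                 // exclude and re-request
        }
        throw new RuntimeException("could not allocate block");
    }

    public static void main(String[] args) {
        // Toy namenode with two datanodes; dn1 is "dead".
        Namenode nn = (src, excluded) ->
            excluded.contains("dn1") ? new String[]{"dn2"}
                                     : new String[]{"dn1", "dn2"};
        String[] targets = nextBlockTargets(nn, "/f", "dn1", 3);
        if (!Arrays.equals(targets, new String[]{"dn2"}))
            throw new AssertionError("expected dead node to be excluded");
        System.out.println("allocated on: " + Arrays.toString(targets));
    }
}
```

          On a small cluster this is what makes the difference: plain re-requests can keep returning the same dead datanode, while an exclusion list guarantees progress.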
          stack added a comment -

          After chatting with Nicholas and Cosmin, it was suggested that the best way to proceed would be to back out 0001-Fix-HDFS-630-trunk-svn-2.patch and then run the new, improved patch via Hudson.

          stack added a comment -

          Reopening so can submit improved patch.

          stack added a comment -

          Submitting to Hudson.

          Tsz Wo Nicholas Sze added a comment -

          +1
          0001-Fix-HDFS-630-0.21-svn.patch looks good.
          Thanks, Cosmin.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #151 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/151/)
          In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block; back out this patch so can replace w/ improved version

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12428216/0001-Fix-HDFS-630-0.21-svn.patch
          against trunk revision 892941.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 13 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/86/console

          This message is automatically generated.

          stack added a comment -

          Any chance of a patch that will apply to TRUNK, Cosmin? The 0.21 patch does the below when applied. Thanks.

          patching file src/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java
          Hunk #1 FAILED at 44.
          Hunk #2 succeeded at 192 (offset 2 lines).
          1 out of 2 hunks FAILED -- saving rejects to file src/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java.rej
          
          Hudson added a comment -

          Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #154 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/154/)
          In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block; back out this patch so can replace w/ improved version

          Cosmin Lehene added a comment -

          @stack unfortunately, no. The patch needs to be changed for trunk.

          ClientProtocol.java
          Index: src/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java
          ===================================================================
          --- src/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java	(revision 891402)
          +++ src/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java	(working copy)
          @@ -44,9 +44,9 @@
              * Compared to the previous version the following changes have been introduced:
              * (Only the latest change is reflected.
              * The log of historical changes can be retrieved from the svn).
          -   * 50: change LocatedBlocks to include last block information.
          +   * 51: changed addBlock to include a list of excluded datanodes.
              */
          -  public static final long versionID = 50L;
          +  public static final long versionID = 51L;
          

          The versionID in 0.21 changes from 50L to 51L. The problem is that on trunk it is already 52L, so it should probably change from 52L to 53L. This could, however, be ignored on trunk and changed independently. I'm not sure what the right approach is. I could create another patch for trunk; however, this would just render versionID meaningless - it would be 51L on 0.21, but on trunk 51L is something else.

          Tsz Wo Nicholas Sze added a comment -

          > The versionID in 0.21 changes from 50L to 51L. The problem is that on trunk is already 52L so it should probably change it from 52L to 53L. This could be, however ignored on trunk and changed independently. I'm not sure what's the right approach. ...

          We usually update versionID to max+1, max+2, etc., for each Hadoop version in ascending order. In our case, we probably should update versionID in 0.21 and trunk to 53L and 54L, respectively.

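          The point of bumping versionID at all can be sketched with a minimal compatibility check. This is illustrative only: the class and method names are hypothetical, and the 53L/54L values simply follow the convention proposed above, not the actual Hadoop RPC handshake code.

```java
// Minimal sketch of why mismatched ClientProtocol versionIDs must be rejected:
// a client built against one protocol revision refuses to talk to a namenode
// exposing a different one.
class VersionCheckSketch {
    static final long CLIENT_VERSION = 53L;  // e.g. a 0.21 client per the comment

    static void checkVersion(long serverVersion) {
        if (serverVersion != CLIENT_VERSION) {
            throw new RuntimeException("Protocol version mismatch: client="
                + CLIENT_VERSION + ", server=" + serverVersion);
        }
    }

    public static void main(String[] args) {
        checkVersion(53L);                   // same revision: accepted
        try {
            checkVersion(54L);               // e.g. a trunk server: rejected
            throw new AssertionError("mismatch not detected");
        } catch (RuntimeException expected) {
            System.out.println("mismatch detected as expected");
        }
    }
}
```

          Keeping the IDs strictly ascending across branches (max+1, max+2) is what lets this single comparison stay meaningful: no two different protocol shapes ever share a number.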
          dhruba borthakur added a comment -

          > update versionID in 0.21 and trunk to 53L and 54L

          +1

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #182 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/182/)

          Cosmin Lehene added a comment -

          New patches for 0.21 and trunk. ClientProtocol versionID is 53L for 0.21 and 54L for trunk.

          stack added a comment -

          Trunk v3 applies for me (with some small slop). Submitting to Hudson.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12428982/0001-Fix-HDFS-630-0.21-svn-1.patch
          against trunk revision 893650.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 13 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/161/console

          This message is automatically generated.

          stack added a comment -

          Re-attaching the v3 trunk patch so it becomes the last patch uploaded and Hudson picks it up instead of the 0.21 version.

          stack added a comment -

          Trying Hudson again. Hopefully it picks up the trunk patch this time.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12428986/0001-Fix-HDFS-630-trunk-svn-3.patch
          against trunk revision 893650.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 13 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/162/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/162/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/162/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/162/console

          This message is automatically generated.

          Hudson added a comment -

          Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/)

          Todd Lipcon added a comment -

          Here's a patch against branch-0.20 that we are evaluating for inclusion in our distro, in case anyone else is interested in applying it on their own. This is a new feature and the patch contains a fair amount of hackery, so I don't intend this to be committed to Apache's branch-0.20.

          Cosmin Lehene added a comment -

          Attaching the 0.21 patch with the javadoc link fixed.

          Cosmin Lehene added a comment -

          Patch for trunk with the javadoc link fixed.
          The TestFiHFlush test that failed previously works fine when running the tests with ant, so nothing was done about it.

          Cosmin Lehene added a comment -

          Canceling to restart the build.

          Cosmin Lehene added a comment -

          Trying the trunk patch one more time. I don't exactly know how to trigger a 0.21 patch/build.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12430569/0001-Fix-HDFS-630-trunk-svn-4.patch
          against trunk revision 899747.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 13 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/192/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/192/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/192/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/192/console

          This message is automatically generated.

          Cosmin Lehene added a comment -

          I have an "it runs on my machine" feeling. Trying once more.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12430569/0001-Fix-HDFS-630-trunk-svn-4.patch
          against trunk revision 899747.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 13 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/193/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/193/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/193/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/193/console

          This message is automatically generated.

          Cosmin Lehene added a comment -

          Tests fail erratically; canceling again.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12430569/0001-Fix-HDFS-630-trunk-svn-4.patch
          against trunk revision 899747.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 13 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/194/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/194/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/194/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/194/console

          This message is automatically generated.

          stack added a comment -

          Here is a summary of Cosmin's erratic experience running his patch against Hudson, where every time he ran it different tests failed: http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201001.mbox/<C779EEE1.15AE4%25clehene@adobe.com>

          I ran Cosmin's patch locally using the test-patch command against branch-0.21:

          $ ANT_HOME=/usr/bin/ant ant -Dfindbugs.home=/Users/stack/bin/findbugs-1.3.9 -Djava5.home=/System/Library/Frameworks/JavaVM.framework/Versions/1.5/Home/ -Dforrest.home=/Users/stack/bin/apache-forrest-0.8 -Dcurl.cmd=/usr/bin/curl -Dwget.cmd="/sw/bin/wget --no-check-certificate" -Dpatch.file=/tmp/0001-Fix-HDFS-630-0.21-svn-2.patch test-patch
          

          ... it outputs the below:

          ...
              [exec] There appear to be 102 release audit warnings before the patch and 102 release audit warnings after applying the patch.
               [exec] 
               [exec] 
               [exec] 
               [exec] 
               [exec] +1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     +1 tests included.  The patch appears to include 13 new or modified tests.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
               [exec] 
               [exec] 
               [exec] 
               [exec] 
               [exec] ======================================================================
               [exec] ======================================================================
               [exec]     Finished build.
               [exec] ======================================================================
               [exec] ======================================================================
          

          Let me run against TRUNK next...

          stack added a comment -

          Here's results running test-patch of Cosmin's above trunk patch:

          $ ANT_HOME=/usr/bin/ant ant -Dfindbugs.home=/Users/stack/bin/findbugs-1.3.9 -Djava5.home=/System/Library/Frameworks/JavaVM.framework/Versions/1.5/Home/ -Dforrest.home=/Users/stack/bin/apache-forrest-0.8 -Dcurl.cmd=/usr/bin/curl -Dwget.cmd="/sw/bin/wget --no-check-certificate" -Dpatch.file=/tmp/0001-Fix-HDFS-630-trunk-svn-4.patch test-patch
          ....
          
               [exec] There appear to be 117 release audit warnings before the patch and 117 release audit warnings after applying the patch.
               [exec] 
               [exec] 
               [exec] 
               [exec] 
               [exec] +1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     +1 tests included.  The patch appears to include 13 new or modified tests.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
               [exec] 
               [exec] 
               [exec] 
               [exec] 
               [exec] ======================================================================
               [exec] ======================================================================
               [exec]     Finished build.
               [exec] ======================================================================
               [exec] ======================================================================
               [exec] 
               [exec] 
          
          BUILD SUCCESSFUL
          Total time: 10 minutes 39 seconds
          
          Tsz Wo Nicholas Sze added a comment -

          I agree that the test failure in the previous Hudson run is unrelated, since some classes were not found.

          java.lang.NoClassDefFoundError: org/apache/hadoop/ipc/Server$Handler
          	at org.apache.hadoop.ipc.Server.start(Server.java:1112)
          	...
          
          Tsz Wo Nicholas Sze added a comment -

          Let's try it again.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12430569/0001-Fix-HDFS-630-trunk-svn-4.patch
          against trunk revision 901316.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 13 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/197/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/197/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/197/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/197/console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          Got NoClassDefFoundError again. I bet Hudson has some problems.

          java.lang.NoClassDefFoundError: org/apache/hadoop/ipc/Server$Handler
          	at org.apache.hadoop.ipc.Server.start(Server.java:1112)
          	...
          

          I ran all the tests on my machine; they all passed.

          stack added a comment -

          Ran another vote on hdfs-dev as to whether to apply Cosmin's latest patch to the 0.21 branch. The vote passed with 14 +1s and no -1s. See the thread here: http://www.mail-archive.com/hbase-dev@hadoop.apache.org/msg16804.html

          stack added a comment -

          Applied 0001-Fix-HDFS-630-0.21-svn-2.patch to branch-0.21 and 0001-Fix-HDFS-630-trunk-svn-4.patch to trunk. Thanks for the patch, Cosmin Lehene.

          stack added a comment -

          Resolving.

          Cosmin Lehene added a comment -

          I'm glad it finally got into both 0.21 and trunk. It was a long-lived issue. Thanks for the support!

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #178 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/178/)
          In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block
          In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #212 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/212/)
          In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block
          In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block

          Hudson added a comment -

          Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #208 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/208/)

          stack added a comment -

          This should be pulled into the branch-0.20-append branch.

          Cosmin Lehene added a comment -

          There's a patch for 0.20 adapted by tlipcon. Can we use that?

          Nicolas Spiegelberg added a comment -

          Version of this patch for the 0.20-append branch. Removed the dependency on JUnit 4.5.

          Jitendra Nath Pandey added a comment -

          Patch for the 0.20-security branch uploaded.

          Suresh Srinivas added a comment -

          +1 for the patch.

          Suresh Srinivas added a comment -

          I committed the patch to 0.20-security branch.


            People

            • Assignee:
              Cosmin Lehene
            • Reporter:
              Ruyue Ma
            • Votes:
              1
            • Watchers:
              13