HDFS-9083: Replication violates block placement policy

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.7.2, 2.6.3
    • Component/s: namenode
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      Recently we have been noticing many cases in which all the replicas of a block reside on the same rack.
      During block creation, the block placement policy was honored.
      But after node failure events in some specific sequence, the block ends up in this state.

      On investigating further, I found that BlockManager#blockHasEnoughRacks depends on the config net.topology.script.file.name:

       if (!this.shouldCheckForEnoughRacks) {
         return true;
       }
      

      We specify a custom DNSToSwitchMapping implementation via net.topology.node.switch.mapping.impl and no longer use the net.topology.script.file.name config.
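      For illustration, here is a minimal sketch of why the check stays disabled in such a setup. It mirrors the way the flag is derived in the affected versions (only from the script-based config), but it is not a verbatim excerpt, and the custom mapping class name is a placeholder:

       import org.apache.hadoop.conf.Configuration;

       public class RackCheckFlagSketch {
         public static void main(String[] args) {
           Configuration conf = new Configuration();
           // Topology comes from a custom DNSToSwitchMapping class, not a script.
           // The class name below is only a placeholder.
           conf.set("net.topology.node.switch.mapping.impl",
               "com.example.CustomDNSToSwitchMapping");

           // The affected versions derive the flag from the script config alone,
           // so it stays false for script-less, custom-mapping clusters ...
           boolean shouldCheckForEnoughRacks =
               conf.get("net.topology.script.file.name") != null;

           // ... and blockHasEnoughRacks() then returns true for every block,
           // so re-replication after node failures never re-checks rack spread.
           System.out.println("shouldCheckForEnoughRacks = " + shouldCheckForEnoughRacks);
         }
       }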

      Attachments

      1. HDFS-9083-Test fix-branch-2.7.patch
        2 kB
        Brahma Reddy Battula
      2. HDFS-9083-branch-2.6.patch
        4 kB
        Rushabh S Shah
      3. HDFS-9083-branch-2.7.patch
        4 kB
        Rushabh S Shah

        Activity

        djp Junping Du added a comment -

        Got it. Thanks for the confirmation!

        kihwal Kihwal Lee added a comment -

        Junping Du, that's correct.

        djp Junping Du added a comment -

        Hi Rushabh S Shah, Kihwal Lee and Sangjin Lee, I assume the issue fixed here only applies to 2.7/2.6 and does not affect 2.8.0 and 3.0.0. Can you confirm? Thanks!

        brahmareddy Brahma Reddy Battula added a comment -

        Raised HDFS-9501 for this testcase fix.

        brahmareddy Brahma Reddy Battula added a comment -

        Xiaoyu Yao, I uploaded an addendum patch to fix the testcase failure. It's OK with me if you want it done in a separate jira.

        brahmareddy Brahma Reddy Battula added a comment -

        Xiaoyu Yao, yes, it's not handled in branch-2.7.

        xyao Xiaoyu Yao added a comment -

        Thanks Brahma Reddy Battula for the explanation. That helps to understand the issue.
        Is this fixed in 2.7.x branches such as branch-2.7.1 or branch-2.7.2? If not, we need a separate ticket for the unit test fix.

        brahmareddy Brahma Reddy Battula added a comment -

        Xiaoyu Yao, thanks for pointing this out.

        cluster = new MiniDFSCluster.Builder(conf).numDataNodes(capacities.length)
            .hosts(new String[]{"localhost", "localhost"})
            .racks(new String[]{"rack0", "rack1"})
            .simulatedCapacities(capacities).build();
        

        2 DNs are started with "rack1". Ideally we should not create 2 DNs with the same hostname. Pinning depends on favoredNodes: DFSClient#create(..) only uses host:port. If favoredNodes is created with new InetSocketAddress(ip, port), DFSClient will attempt a reverse lookup locally to get host:port, instead of sending ip:port directly to the NameNode.

        MiniDFSCluster uses the fake hostname "host1.foo.com" to start DataNodes, but DFSClient doesn't use StaticMapping. So if DFSClient does a reverse lookup, "127.0.0.1:8020" becomes "localhost:8020".
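        For illustration, pinning is driven by the favored-nodes overload of create(); here is a minimal sketch of a client using it, assuming the DistributedFileSystem overload available in 2.x (the hostnames and the transfer port are hypothetical examples):

        import java.net.InetSocketAddress;

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.fs.permission.FsPermission;
        import org.apache.hadoop.hdfs.DistributedFileSystem;

        public class FavoredNodesSketch {
          public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

            // Build favored nodes from hostnames the NameNode knows the DataNodes by.
            // If raw IPs were used instead, DFSClient would reverse-resolve them, and
            // on a single-host MiniDFSCluster 127.0.0.1 becomes "localhost", which no
            // longer matches the DataNode registration.
            InetSocketAddress[] favoredNodes = {
                new InetSocketAddress("host0", 50010),
                new InetSocketAddress("host1", 50010),
            };

            dfs.create(new Path("/pinned-file"), FsPermission.getFileDefault(),
                true, 4096, (short) 2, 128L * 1024 * 1024, null, favoredNodes)
                .close();
          }
        }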

        The fix can be like the following, which is the same as what I did in branch-2 and trunk.

        +    String[] hosts = {"host0", "host1"};
             String[] racks = { RACK0, RACK1 };
             int numOfDatanodes = capacities.length;
         
             cluster = new MiniDFSCluster.Builder(conf).numDataNodes(capacities.length)
        -      .hosts(new String[]{"localhost", "localhost"})
        -      .racks(racks).simulatedCapacities(capacities).build();
        +        .hosts(hosts).racks(racks).simulatedCapacities(capacities).build();
         
             try {
               cluster.waitActive();
        @@ -377,7 +377,10 @@ public void testBalancerWithPinnedBlocks() throws Exception {
               long totalUsedSpace = totalCapacity * 8 / 10;
               InetSocketAddress[] favoredNodes = new InetSocketAddress[numOfDatanodes];
               for (int i = 0; i < favoredNodes.length; i++) {
        -        favoredNodes[i] = cluster.getDataNodes().get(i).getXferAddress();
        +        // DFSClient will attempt reverse lookup. In case it resolves
        +        // "127.0.0.1" to "localhost", we manually specify the hostname.
        +        int port = cluster.getDataNodes().get(i).getXferAddress().getPort();
        +        favoredNodes[i] = new InetSocketAddress(hosts[i], port);
        
        xyao Xiaoyu Yao added a comment -

        The 2.7 patch caused failure of TestBalancer#testBalancerWithPinnedBlocks. The test was passing without this patch.
        Rushabh S Shah, can you take a look?

        Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
        Running org.apache.hadoop.hdfs.server.balancer.TestBalancer
        Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 12.888 sec <<< FAILURE! - in org.apache.hadoop.hdfs.server.balancer.TestBalancer
        testBalancerWithPinnedBlocks(org.apache.hadoop.hdfs.server.balancer.TestBalancer) Time elapsed: 12.748 sec <<< FAILURE!
        java.lang.AssertionError: expected:<-3> but was:<0>
        at org.junit.Assert.fail(Assert.java:88)
        at org.junit.Assert.failNotEquals(Assert.java:743)
        at org.junit.Assert.assertEquals(Assert.java:118)
        at org.junit.Assert.assertEquals(Assert.java:555)
        at org.junit.Assert.assertEquals(Assert.java:542)
        at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancerWithPinnedBlocks(TestBalancer.java:362)

        Results :

        Failed tests:
        TestBalancer.testBalancerWithPinnedBlocks:362 expected:<-3> but was:<0>

        sjlee0 Sangjin Lee added a comment -

        +1. Committed it to branch-2.6. Thanks Rushabh S Shah!

        shahrs87 Rushabh S Shah added a comment -

        Hi Sangjin Lee: PFA the patch for branch-2.6.

        shahrs87 Rushabh S Shah added a comment -

        Attaching patch for branch-2.6

        sjlee0 Sangjin Lee added a comment -

        Hi Rushabh S Shah, could you please create a patch for branch-2.6? I don't think the changes in 2.7.x apply cleanly to 2.6.x.

        shahrs87 Rushabh S Shah added a comment -

        Yes. This bug is there in 2.6 also.

        sjlee0 Sangjin Lee added a comment -

        Should this be backported to branch-2.6?

        shahrs87 Rushabh S Shah added a comment -

        Kihwal Lee: Thanks for committing.

        kihwal Kihwal Lee added a comment -

        Thanks Jing Zhao, Ming Ma and Brahma Reddy Battula for the reviews, and thanks for reporting, analyzing and fixing the issue, Rushabh S Shah. I've committed this to branch-2.7.

        kihwal Kihwal Lee added a comment -

        +1

        shahrs87 Rushabh S Shah added a comment -

        Ran the findbugs on hadoop-hdfs-project and there is one findbugs warning in PBImageTextWriter.java.
        I haven't made any changes in that file.

        shahrs87 Rushabh S Shah added a comment -

        Jing Zhao, Ming Ma, Brahma Reddy Battula: Thanks for the reviews.
        I ran all the HDFS tests since Jenkins failed to run them.
        The following tests failed:

        TestSecureNNWithQJM#testSecureMode
        TestSecureNNWithQJM#testSecondaryNameNodeHttpAddressNotNeeded
        TestAppendSnapshotTruncate#testAST
        TestBalancer#testTwoReplicaShouldNotInSameDN
        TestBalancer#testBalancerWithPinnedBlocks
        TestBalancer#testBalancerWithZeroThreadsForMove
        TestBalancerWithSaslDataTransfer#testBalancer0Integrity
        TestBalancerWithSaslDataTransfer#testBalancer0Authentication
        TestBalancerWithSaslDataTransfer#testBalancer0Privacy
        TestBalancerWithNodeGroup#testBalancerWithNodeGroup
        TestBalancerWithNodeGroup#testBalancerEndInNoMoveProgress
        TestSaslDataTransfer#testServerSaslNoClientSasl
        TestSaslDataTransfer#testClientAndServerDoNotHaveCommonQop
        TestSaslDataTransfer#testAuthentication
        TestSaslDataTransfer#testPrivacy
        TestSaslDataTransfer#testNoSaslAndSecurePortsIgnored
        TestSaslDataTransfer#testIntegrity
        

        I ran all these tests multiple times.
        All of them failed consistently except TestAppendSnapshotTruncate#testAST, which failed intermittently.

        I also ran all the failed tests without my patch and they still failed,
        so none of the test failures are related to my patch.
        I will start test-patch.sh on my machine and upload the results shortly.

        brahmareddy Brahma Reddy Battula added a comment -

        Jing Zhao, thanks for pinging. Patch LGTM. Rushabh S Shah, as Jenkins did not run, can you confirm the testcase failures in branch-2.7?

        mingma Ming Ma added a comment -

        LGTM. Rushabh S Shah, any test failures in branch-2.7 (if any, it should just be a test code update)?

        jingzhao Jing Zhao added a comment -

        The patch looks good to me. Ming Ma and Brahma Reddy Battula, do you also want to take a look at the patch since you worked on HDFS-8647?

        hadoopqa Hadoop QA added a comment -



        -1 overall

        Vote  Subsystem  Runtime  Comment
        -1    patch      0m 0s    The patch command could not apply the patch during dryrun.

        Subsystem        Report/Notes
        Patch URL        http://issues.apache.org/jira/secure/attachment/12768797/HDFS-9083-branch-2.7.patch
        Optional Tests   javadoc javac unit findbugs checkstyle
        git revision     branch-2 / baa2998
        Console output   https://builds.apache.org/job/PreCommit-HDFS-Build/13198/console

        This message was automatically generated.

        jingzhao Jing Zhao added a comment -

        Thanks for working on this, Rushabh Shah. Yes, we need to fix this in branch-2.7 and currently this is a blocker for 2.7.2.

        shahrs87 Rushabh S Shah added a comment -

        This is a conversation from https://issues.apache.org/jira/browse/HDFS-8647?focusedCommentId=14956537&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14956537
        shahrs87 Rushabh S Shah added a comment -

        "Most of the change looks good. It seems this will also fix HDFS-9083. cc: Rushabh S Shah."

        Ming Ma: Thanks for letting me know. Do we need to fix it in branch-2.7?

        jagadesh.kiran Jagadesh Kiran N added a comment -

        Rushabh S Shah, no problem, please go ahead and assign it.

        shahrs87 Rushabh S Shah added a comment -

        Jagadesh Kiran N: Do you mind if I assign this jira to myself?
        I have started working on the patch.
        I forgot to assign it to myself when I created the jira.


          People

          • Assignee:
            shahrs87 Rushabh S Shah
          • Reporter:
            shahrs87 Rushabh S Shah
          • Votes:
            0
          • Watchers:
            17

            Dates

            • Created:
              Updated:
              Resolved:
