Uploaded image for project: 'Hadoop Common'
  1. Hadoop Common
  2. HADOOP-1256

Dfs image loading and edits loading creates multiple instances of DatanodeDescriptor for the same datanode

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 0.12.3
    • 0.13.0
    • None
    • None

    Description

      This leads to multiple instances of DatanodeDescriptors for the same datanode stored in Host2DatanodeMap.

      Attachments

        1. nodeMap.patch
          1 kB
          Hairong Kuang

        Issue Links

          Activity

            hairong Hairong Kuang added a comment -

            This patch makes sure that host2DatanodeMap is consistent with datanodeMap and thus eliminates the possibility that host2DatanodeMap has more than one instance of the datanode.

            hairong Hairong Kuang added a comment - This patch makes sure that host2DatanodeMap is consistent with datanodeMap and thus eliminates the possibility that host2DatanodeMap has more than one instance of the datanode.

            In my test the image file contains a data-node D0 = <name0, storageID>.
            And the edits file has two record [remove D0], [add D1], where D1 = <name1, storageID>.
            storageID is the same meaning that the I'm starting the same data-node on different ip addresses/ports.

            I start the name-node, and I get an empty edits file and the image containing D1, which means that the
            edits have been applied correctly, everything is as expected.

            Then I start data-node D0 and see 2 problems that I believe are related to this issue.
            1. The edits file contains 5 add/remove records in it.
            There should be just 2: [remove D1], [add D0]
            2. The first record in the edits file is [remove D0].
            And if I try to restart the name-node it throws UnregisteredDatanodeException exception:

            07/04/12 17:39:40 ERROR dfs.NameNode: org.apache.hadoop.dfs.UnregisteredDatanodeException: Data node <name0> is attempting to report storage ID DS1537505994. Node <name0> is expected to serve this storage.
            at org.apache.hadoop.dfs.FSNamesystem.getDatanode(FSNamesystem.java:3461)
            at org.apache.hadoop.dfs.FSEditLog.loadFSEdits(FSEditLog.java:311)
            at org.apache.hadoop.dfs.FSImage.loadFSEdits(FSImage.java:672)
            at org.apache.hadoop.dfs.FSImage.loadFSImage(FSImage.java:585)
            at org.apache.hadoop.dfs.FSImage.recoverTransitionRead(FSImage.java:220)
            at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:346)
            at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:251)
            at org.apache.hadoop.dfs.NameNode.init(NameNode.java:173)
            at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:211)
            at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:820)
            at org.apache.hadoop.dfs.NameNode.main(NameNode.java:828)

            I tried the patch it did not fix this problem.

            shv Konstantin Shvachko added a comment - In my test the image file contains a data-node D0 = <name0, storageID>. And the edits file has two record [remove D0] , [add D1] , where D1 = <name1, storageID>. storageID is the same meaning that the I'm starting the same data-node on different ip addresses/ports. I start the name-node, and I get an empty edits file and the image containing D1, which means that the edits have been applied correctly, everything is as expected. Then I start data-node D0 and see 2 problems that I believe are related to this issue. 1. The edits file contains 5 add/remove records in it. There should be just 2: [remove D1] , [add D0] 2. The first record in the edits file is [remove D0] . And if I try to restart the name-node it throws UnregisteredDatanodeException exception: 07/04/12 17:39:40 ERROR dfs.NameNode: org.apache.hadoop.dfs.UnregisteredDatanodeException: Data node <name0> is attempting to report storage ID DS1537505994. Node <name0> is expected to serve this storage. at org.apache.hadoop.dfs.FSNamesystem.getDatanode(FSNamesystem.java:3461) at org.apache.hadoop.dfs.FSEditLog.loadFSEdits(FSEditLog.java:311) at org.apache.hadoop.dfs.FSImage.loadFSEdits(FSImage.java:672) at org.apache.hadoop.dfs.FSImage.loadFSImage(FSImage.java:585) at org.apache.hadoop.dfs.FSImage.recoverTransitionRead(FSImage.java:220) at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:346) at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:251) at org.apache.hadoop.dfs.NameNode.init(NameNode.java:173) at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:211) at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:820) at org.apache.hadoop.dfs.NameNode.main(NameNode.java:828) I tried the patch it did not fix this problem.
            hairong Hairong Kuang added a comment -

            This is the cause of HADOOP-1254.

            hairong Hairong Kuang added a comment - This is the cause of HADOOP-1254 .

            +1
            the new patch solved the problem.

            shv Konstantin Shvachko added a comment - +1 the new patch solved the problem.
            hadoopqa Hadoop QA added a comment - +1 http://issues.apache.org/jira/secure/attachment/12355474/nodeMap.patch applied and successfully tested against trunk revision r528230. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/41/testReport/ Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/41/console

            General comment:

            We should treat differently optimization patches.
            It would be good to have some measurements that prove the optimization really
            works and worth the complexity involved.

            These bugs are related to the optimization issue HADOOP-971, which introduced
            a new data-structure in the name-node in order to accelerate access to data-nodes by name.
            It involves more complexity in synchronizing the new map with the other structures.
            We still don't know what the benefits are.

            We have TestDFSIO and NNBench to measure the performance.
            In this particular case the cluster should have a lot of data-nodes, so it probably needs
            a custom benchmark, which should run imo both on a small cluster to show the existing
            performance does not degrade and on a large one to show the advantages.

            I'd propose to make it a requirement for committing optimization patches.

            shv Konstantin Shvachko added a comment - General comment: We should treat differently optimization patches. It would be good to have some measurements that prove the optimization really works and worth the complexity involved. These bugs are related to the optimization issue HADOOP-971 , which introduced a new data-structure in the name-node in order to accelerate access to data-nodes by name. It involves more complexity in synchronizing the new map with the other structures. We still don't know what the benefits are. We have TestDFSIO and NNBench to measure the performance. In this particular case the cluster should have a lot of data-nodes, so it probably needs a custom benchmark, which should run imo both on a small cluster to show the existing performance does not degrade and on a large one to show the advantages. I'd propose to make it a requirement for committing optimization patches.
            rangadi Raghu Angadi added a comment -

            Just adding to the discussion :

            > These bugs are related to the optimization issue HADOOP-971, which introduced
            > a new data-structure in the name-node in order to accelerate access to data-nodes by name.
            > It involves more complexity in synchronizing the new map with the other structures.
            > We still don't know what the benefits are.

            Would this bug be any easier to find or fix if this was introduced as part of an feature improvement? Or do you think optimization patches tend to get less rigorously tested and reviewed?

            In some cases, performance improvements are really obvious even just by looking at the code. I do not want to take credit for any part of HADOOP-971, but looks like it was tested to show improvement.

            rangadi Raghu Angadi added a comment - Just adding to the discussion : > These bugs are related to the optimization issue HADOOP-971 , which introduced > a new data-structure in the name-node in order to accelerate access to data-nodes by name. > It involves more complexity in synchronizing the new map with the other structures. > We still don't know what the benefits are. Would this bug be any easier to find or fix if this was introduced as part of an feature improvement? Or do you think optimization patches tend to get less rigorously tested and reviewed? In some cases, performance improvements are really obvious even just by looking at the code. I do not want to take credit for any part of HADOOP-971 , but looks like it was tested to show improvement.
            cutting Doug Cutting added a comment -

            The benefits of new features should be evaluated differently than those of performance optimizations. Both should be weighed against their added complexity, but the improvements offered by optimizations can and should be quantitatively measured before they're committed. Optimizing things by eye is known to be error-prone. Evaluating features is more subjective.

            So, +1 for requiring benchmark results for optimizations.

            cutting Doug Cutting added a comment - The benefits of new features should be evaluated differently than those of performance optimizations. Both should be weighed against their added complexity, but the improvements offered by optimizations can and should be quantitatively measured before they're committed. Optimizing things by eye is known to be error-prone. Evaluating features is more subjective. So, +1 for requiring benchmark results for optimizations.
            nidaley Nigel Daley added a comment -

            Some code comments and a unit test would be good.

            nidaley Nigel Daley added a comment - Some code comments and a unit test would be good.
            hairong Hairong Kuang added a comment -

            As I discussed with Nigel and Konstantin, a deterministic junit test is hard to be created. So I submitted the patch with comments but without a unit test.

            hairong Hairong Kuang added a comment - As I discussed with Nigel and Konstantin, a deterministic junit test is hard to be created. So I submitted the patch with comments but without a unit test.
            hadoopqa Hadoop QA added a comment - +1 http://issues.apache.org/jira/secure/attachment/12355529/nodeMap.patch applied and successfully tested against trunk revision r528230. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/49/testReport/ Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/49/console
            hairong Hairong Kuang added a comment -

            I want to make it clear that HADOOP-971 patch was tested on a 1800-node cluster before it got committed. But this bug occurs only when fsimage & fsedits contain multiple add entries for the same storage id. So unfortunately the bug was not caught.

            I understand Konstantin's concern that HADOOP-971 adds complexity to NameNode. I'd be happy if anybody comes up with an idea that removes the getDatanodeByHost bottleneck without introducing the host2DatanodeMap.

            hairong Hairong Kuang added a comment - I want to make it clear that HADOOP-971 patch was tested on a 1800-node cluster before it got committed. But this bug occurs only when fsimage & fsedits contain multiple add entries for the same storage id. So unfortunately the bug was not caught. I understand Konstantin's concern that HADOOP-971 adds complexity to NameNode. I'd be happy if anybody comes up with an idea that removes the getDatanodeByHost bottleneck without introducing the host2DatanodeMap.
            nidaley Nigel Daley added a comment -

            +1

            Tom, Doug, lets get this committed ASAP so that trunk unit testing (and thus patch process) is no longer broken.

            nidaley Nigel Daley added a comment - +1 Tom, Doug, lets get this committed ASAP so that trunk unit testing (and thus patch process) is no longer broken.
            cutting Doug Cutting added a comment -

            I just committed this. Thanks, Hairong!

            cutting Doug Cutting added a comment - I just committed this. Thanks, Hairong!
            hadoopqa Hadoop QA added a comment -
            hadoopqa Hadoop QA added a comment - Integrated in Hadoop-Nightly #60 (See http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/60/ )

            People

              hairong Hairong Kuang
              hairong Hairong Kuang
              Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: