  1. Hadoop HDFS
  2. HDFS-9600

do not check replication if the block is under construction


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 2.7.3, 2.6.4, 3.0.0-alpha1
    • Component/s: None
    • Labels: None

    Description

      When appending to a file, we update the pipeline and bump to a new generation stamp (GS), so the old GS is considered out of date. When the GS changes, BlockInfo.setGenerationStampAndVerifyReplicas removes the replicas that have the old GS. This means all replicas are removed, because no DN has the new GS until the block with the new GS is added to the BlocksMap again by DatanodeProtocol.blockReceivedAndDeleted.
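
      The window described above can be illustrated with a toy model. This is only a sketch, not Hadoop code: GsBumpSketch, ToyBlock, reportReplica and bumpGenerationStamp are hypothetical names, and the real logic lives in BlockInfo.setGenerationStampAndVerifyReplicas and the block report path.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: these classes are hypothetical and are not Hadoop code.
class GsBumpSketch {

    /** NameNode-side view of one block and the replicas recorded for it. */
    static class ToyBlock {
        long generationStamp;
        // DataNode name -> generation stamp recorded for that DataNode's replica
        final Map<String, Long> replicas = new HashMap<>();

        ToyBlock(long generationStamp) {
            this.generationStamp = generationStamp;
        }

        /** Record a replica report; reports with a non-matching GS are ignored. */
        void reportReplica(String dataNode, long reportedGs) {
            if (reportedGs == generationStamp) {
                replicas.put(dataNode, reportedGs);
            }
        }

        /** Bump the GS and drop every replica still recorded with another GS. */
        void bumpGenerationStamp(long newGs) {
            generationStamp = newGs;
            replicas.values().removeIf(gs -> gs != newGs);
        }

        int liveReplicas() {
            return replicas.size();
        }
    }

    public static void main(String[] args) {
        ToyBlock block = new ToyBlock(1001L);
        block.reportReplica("dn1", 1001L);
        block.reportReplica("dn2", 1001L);
        block.reportReplica("dn3", 1001L);
        System.out.println("replicas before append: " + block.liveReplicas());

        // Append starts: the pipeline update bumps the GS.
        block.bumpGenerationStamp(1002L);
        System.out.println("replicas right after GS bump: " + block.liveReplicas());

        // A DataNode later reports the block with the new GS.
        block.reportReplica("dn1", 1002L);
        System.out.println("replicas after first report: " + block.liveReplicas());
    }
}
```

      Between the GS bump and the first report, the toy block has no recorded replicas even though every DataNode still holds the data. Any replication check that runs in that window sees a block with zero replicas.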

      If we check the replication of this block before it is added back, it is regarded as missing. The probability is normally low, but if there are decommissioning nodes, DecommissionManager.Monitor scans all blocks that belong to those nodes very quickly, so the probability of reporting a missing block becomes very high even though the block is not actually missing.

      Furthermore, after the appended file is closed, FSNamesystem.finalizeINodeFileUnderConstruction calls checkReplication. If some of the nodes are decommissioning, the block with the new GS is added to the UnderReplicatedBlocks map, so there are two blocks with the same ID in this map: one in QUEUE_WITH_CORRUPT_BLOCKS and the other in QUEUE_HIGHEST_PRIORITY or QUEUE_UNDER_REPLICATED. As a result, the NameNode web UI shows many missing-block warnings, but there are no corrupt files.

      Therefore, I think the solution is to not check replication while a block is under construction, and to check only complete blocks.
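
      A minimal sketch of the proposed guard follows. It is not taken from the attached patches: ReplicationCheckSketch, BlockState, Verdict and this checkReplication signature are hypothetical and only show the idea of returning early for blocks that are not complete.

```java
// Hypothetical sketch of the proposed guard; not the code from the patches.
class ReplicationCheckSketch {

    enum BlockState { UNDER_CONSTRUCTION, COMPLETE }

    enum Verdict { SKIPPED, MISSING, UNDER_REPLICATED, OK }

    /**
     * Classify a block by its replica count, but only if it is complete.
     * A block under construction may have zero recorded replicas simply
     * because no DataNode has reported the new GS yet, so it is skipped.
     */
    static Verdict checkReplication(BlockState state, int liveReplicas, int expectedReplicas) {
        if (state != BlockState.COMPLETE) {
            return Verdict.SKIPPED;
        }
        if (liveReplicas == 0) {
            return Verdict.MISSING;
        }
        if (liveReplicas < expectedReplicas) {
            return Verdict.UNDER_REPLICATED;
        }
        return Verdict.OK;
    }

    public static void main(String[] args) {
        // Block being appended: the GS was just bumped, no replicas recorded yet.
        System.out.println(checkReplication(BlockState.UNDER_CONSTRUCTION, 0, 3));
        // Complete block with no replicas: genuinely missing.
        System.out.println(checkReplication(BlockState.COMPLETE, 0, 3));
        // Complete block with too few replicas.
        System.out.println(checkReplication(BlockState.COMPLETE, 2, 3));
    }
}
```

      With a guard of this kind, a scan by the decommission monitor during an append would skip the block instead of queuing it as missing. The block is checked once it becomes complete.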

      Attachments

        1. HDFS-9600-v4.patch
          7 kB
          Phil Yang
        2. HDFS-9600-v3.patch
          7 kB
          Phil Yang
        3. HDFS-9600-v2.patch
          5 kB
          Phil Yang
        4. HDFS-9600-v1.patch
          0.9 kB
          Phil Yang
        5. HDFS-9600-branch-2.patch
          7 kB
          Phil Yang
        6. HDFS-9600-branch-2.7.patch
          7 kB
          Phil Yang
        7. HDFS-9600-branch-2.6.patch
          6 kB
          Phil Yang


          People

            Assignee: Phil Yang (yangzhe1991)
            Reporter: Phil Yang (yangzhe1991)
            Votes: 0
            Watchers: 13
