Hadoop HDFS / HDFS-3931

TestDatanodeBlockScanner#testBlockCorruptionPolicy2 is broken


Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: 2.0.3-alpha
    • Component/s: test
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Per Andy's comment on HDFS-3902:

      TestDatanodeBlockScanner still fails in about 1 of 5 runs, in testBlockCorruptionRecoveryPolicy2. The cause is a separate test issue that was also uncovered by HDFS-3828.
      The failure scenario for this one is a bit trickier. I think the sequence below captures it:

      • The test corrupts 2 of the 3 replicas.
      • The client reports a bad block.
      • The NN asks a DN to re-replicate and randomly picks the other corrupt replica as the source.
      • The DN notices that the incoming replica is corrupt and reports it as a bad block, but does not inform the NN that the re-replication failed.
      • The NN keeps the block on pendingReplications.
      • The BP scanner wakes up on both DNs that hold corrupt replicas, and both report corruption. The NN treats both reports as duplicates: one of the client's report and one of the DN's report above.
        Since the block is still on pendingReplications, the NN does not schedule another replication.
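The sequence above can be replayed with a small toy model. This is only a sketch under stated assumptions: `StuckReplicationSketch`, `ToyNameNode`, and the `dn1`/`dn2`/`dn3` names are hypothetical and are not HDFS classes. The model assumes that a report against an already-known corrupt replica counts as a duplicate, and that a block already on the pending list is not rescheduled.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model of the scenario described in this issue. Every name here is
// hypothetical; none of this is actual HDFS NameNode code.
class StuckReplicationSketch {

    /** Simplified NN bookkeeping for a single block. */
    static class ToyNameNode {
        final Set<String> corruptReplicas = new LinkedHashSet<>();
        boolean onPendingReplications = false;
        int replicationsScheduled = 0;
        int duplicateReports = 0;

        /** Records a bad-block report against the replica stored on dn. */
        void reportBadBlock(String dn) {
            if (!corruptReplicas.add(dn)) {
                duplicateReports++; // replica is already known to be corrupt
                return;
            }
            if (onPendingReplications) {
                return; // assumed behavior: a pending block is not rescheduled
            }
            onPendingReplications = true;
            replicationsScheduled++;
        }
    }

    /** Replays the reports in the order described in the issue. */
    static ToyNameNode simulate() {
        ToyNameNode nn = new ToyNameNode();
        // Replicas on dn1 and dn2 are corrupt; dn3 holds the only good copy.
        nn.reportBadBlock("dn1"); // client report: NN schedules a re-replication
        // The NN happens to pick dn2 as the source. The receiving DN detects
        // the corruption and reports dn2's replica, but nothing tells the NN
        // that the transfer failed, so the block stays on the pending list.
        nn.reportBadBlock("dn2");
        // Block scanners on dn1 and dn2 wake up and repeat what the NN knows.
        nn.reportBadBlock("dn1");
        nn.reportBadBlock("dn2");
        return nn;
    }

    public static void main(String[] args) {
        ToyNameNode nn = simulate();
        System.out.println("corrupt replicas: " + nn.corruptReplicas);
        System.out.println("still pending: " + nn.onPendingReplications);
        System.out.println("replications scheduled: " + nn.replicationsScheduled);
        System.out.println("duplicate reports: " + nn.duplicateReports);
    }
}
```

In this model the block ends with two known-corrupt replicas, a single scheduled replication that never completed, and two duplicate reports, so nothing ever triggers a second replication from the good replica.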

      Attachments

        1. hdfs3931.txt (1 kB, Andy Isaacson)
        2. hdfs3931-1.txt (5 kB, Andy Isaacson)
        3. hdfs3931-2.txt (3 kB, Andy Isaacson)
        4. hdfs3931-3.txt (5 kB, Andy Isaacson)

          People

            Assignee: Andy Isaacson (adi2)
            Reporter: Eli Collins (eli)
            Votes: 0
            Watchers: 7
