Hadoop HDFS / HDFS-3931

TestDatanodeBlockScanner#testBlockCorruptionPolicy2 is broken


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: 2.0.3-alpha
    • Component/s: test
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Per Andy's comment on HDFS-3902:

      TestDatanodeBlockScanner still fails in about 1 of 5 runs in testBlockCorruptionRecoveryPolicy2. That is due to a separate test issue also uncovered by HDFS-3828.
      The failure scenario for this one is a bit trickier. I think I've captured it below:

      • The test corrupts 2 of the 3 replicas.
      • The client reports a bad block.
      • The NN asks a DN to re-replicate, and randomly picks the other corrupt replica.
      • The DN notices the incoming replica is corrupt and reports it as a bad block, but does not inform the NN that re-replication failed.
      • The NN keeps the block on pendingReplications.
      • The BP scanner wakes up on both DNs with corrupt blocks, and both report corruption. The NN reports both as duplicates, one of the client's report and one of the DN report above. Since the block is on pendingReplications, the NN does not schedule another replication.
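
      The sequence above can be sketched as a toy model. This is not HDFS code: ToyNameNode, reportBadBlock, replicationFailed, and the dn1/dn2 names are hypothetical stand-ins that model only the steps listed in the description. The second call to simulate is an illustrative contrast showing what a failure notification would change in this toy; it is not a claim about the fix applied in this issue.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the failure scenario. Not HDFS code; all names are hypothetical. */
public class PendingReplicationSketch {

    /** Minimal stand-in for the NameNode's view of a single block. */
    static class ToyNameNode {
        final Set<String> corruptReplicas = new HashSet<>();
        boolean pendingReplication = false;
        int scheduledReplications = 0;
        int duplicateReports = 0;

        /** Records a corruption report for the replica on the given DN. */
        void reportBadBlock(String dn) {
            if (!corruptReplicas.add(dn)) {
                duplicateReports++; // this replica was already reported as corrupt
            }
            maybeScheduleReplication();
        }

        /** Schedules a replication unless one is already pending. */
        void maybeScheduleReplication() {
            if (pendingReplication) {
                return; // block is on the pending list: nothing new is scheduled
            }
            pendingReplication = true;
            scheduledReplications++;
        }

        /** Hypothetical notification that the pending replication failed. */
        void replicationFailed() {
            pendingReplication = false;
            maybeScheduleReplication();
        }
    }

    /** Replays the scenario and returns how many replications were scheduled. */
    static int simulate(boolean dnReportsFailure) {
        ToyNameNode nn = new ToyNameNode();
        // Replicas on dn1 and dn2 are corrupt; the third replica is good.
        nn.reportBadBlock("dn1"); // client reports a bad block; replication is scheduled
        // The corrupt replica on dn2 is picked as the source; the receiving DN
        // notices the corruption and reports it as a bad block.
        nn.reportBadBlock("dn2");
        if (dnReportsFailure) {
            nn.replicationFailed(); // assumption: not what happens in the report
        }
        // Block scanners on dn1 and dn2 wake up; both reports are duplicates.
        nn.reportBadBlock("dn1");
        nn.reportBadBlock("dn2");
        return nn.scheduledReplications;
    }

    public static void main(String[] args) {
        System.out.println("as described in the report: " + simulate(false));
        System.out.println("with a hypothetical failure notice: " + simulate(true));
    }
}
```

      In this toy, simulate(false) schedules one replication and never retries, which matches the stuck state described above, while simulate(true) schedules a second attempt once the failure is made known.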

        Attachments

        1. hdfs3931.txt
          1 kB
          Andy Isaacson
        2. hdfs3931-1.txt
          5 kB
          Andy Isaacson
        3. hdfs3931-2.txt
          3 kB
          Andy Isaacson
        4. hdfs3931-3.txt
          5 kB
          Andy Isaacson


            People

            • Assignee: Andy Isaacson (adi2)
            • Reporter: Eli Collins (eli)
