Hadoop HDFS / HDFS-11576

Block recovery will fail indefinitely if recovery time > heartbeat interval


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.7.1, 2.7.2, 2.7.3, 3.0.0-alpha1, 3.0.0-alpha2
    • Fix Version/s: 3.0.0, 2.9.1
    • Component/s: datanode, hdfs, namenode
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Block recovery will fail indefinitely if the time to recover a block is always longer than the heartbeat interval. Scenario:
      1. DN sends heartbeat
      2. NN sends a recovery command to DN, recoveryID=X
      3. DN starts recovery
      4. DN sends another heartbeat
      5. NN sends a recovery command to DN, recoveryID=X+1
      6. After the first recovery succeeds, DN calls commitBlockSynchronization on NN with recoveryID=X; the call fails because NN has already moved on to recoveryID=X+1 (X < X+1)
      ...
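
The scenario above can be reproduced in a toy model. The sketch below is not HDFS code: `ToyNameNode`, `issueRecoveryCommand`, and `recoveryEventuallyCommits` are hypothetical names, time advances in whole seconds, and it assumes (as the description states) that every heartbeat triggers a new recovery command with a higher recoveryID and that a commit carrying an older recoveryID is rejected.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the scenario above. Hypothetical names; this is not HDFS code. */
final class RecoveryRaceSketch {

    /** Stand-in for the NN's bookkeeping of one block under recovery (assumption). */
    static final class ToyNameNode {
        private long currentRecoveryId;

        ToyNameNode(long initialRecoveryId) {
            this.currentRecoveryId = initialRecoveryId;
        }

        /** Steps 2 and 5: each heartbeat reply carries a recovery command with a new, higher ID. */
        long issueRecoveryCommand() {
            currentRecoveryId++;
            return currentRecoveryId;
        }

        /** Step 6: a commit carrying an ID older than the current one is rejected. */
        boolean commitBlockSynchronization(long recoveryId) {
            return recoveryId >= currentRecoveryId;
        }
    }

    /**
     * Simulates one DN and one NN for `horizon` seconds and reports whether
     * any recovery attempt manages to commit. All durations are in seconds.
     */
    static boolean recoveryEventuallyCommits(int heartbeatInterval, int recoveryDuration, int horizon) {
        ToyNameNode nn = new ToyNameNode(0);
        // Each entry is {finishTime, recoveryId} for a recovery the DN has started.
        List<long[]> inFlight = new ArrayList<>();
        for (int now = 0; now <= horizon; now++) {
            // Recoveries that finish at this instant try to commit first.
            for (long[] attempt : inFlight) {
                if (attempt[0] == now && nn.commitBlockSynchronization(attempt[1])) {
                    return true;
                }
            }
            // Then the DN heartbeats and receives a fresh recovery command.
            if (now % heartbeatInterval == 0) {
                long recoveryId = nn.issueRecoveryCommand();
                inFlight.add(new long[] {now + recoveryDuration, recoveryId});
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("recovery 5s, heartbeat 3s: " + recoveryEventuallyCommits(3, 5, 600));
        System.out.println("recovery 2s, heartbeat 3s: " + recoveryEventuallyCommits(3, 2, 600));
    }
}
```

In this model a 5-second recovery against a 3-second heartbeat never commits, however long the simulation runs, because a newer recoveryID is always issued before the running attempt finishes. A 2-second recovery commits on the first attempt.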

      Attachments

        1. HDFS-11576.001.patch
          18 kB
          Lukas Majercak
        2. HDFS-11576.002.patch
          24 kB
          Lukas Majercak
        3. HDFS-11576.003.patch
          25 kB
          Lukas Majercak
        4. HDFS-11576.004.patch
          24 kB
          Lukas Majercak
        5. HDFS-11576.005.patch
          26 kB
          Lukas Majercak
        6. HDFS-11576.006.patch
          26 kB
          Lukas Majercak
        7. HDFS-11576.007.patch
          26 kB
          Lukas Majercak
        8. HDFS-11576.008.patch
          15 kB
          Lukas Majercak
        9. HDFS-11576.009.patch
          23 kB
          Lukas Majercak
        10. HDFS-11576.010.patch
          24 kB
          Lukas Majercak
        11. HDFS-11576.011.patch
          24 kB
          Lukas Majercak
        12. HDFS-11576.012.patch
          22 kB
          Lukas Majercak
        13. HDFS-11576.013.patch
          23 kB
          Lukas Majercak
        14. HDFS-11576.014.patch
          23 kB
          Lukas Majercak
        15. HDFS-11576.015.patch
          24 kB
          Lukas Majercak
        16. HDFS-11576.repro.patch
          4 kB
          Lukas Majercak
        17. HDFS-11576-branch-2.00.patch
          25 kB
          Christopher Douglas
        18. HDFS-11576-branch-2.01.patch
          24 kB
          Christopher Douglas


          People

            Assignee: Lukas Majercak (lukmajercak)
            Reporter: Lukas Majercak (lukmajercak)
            Votes: 1
            Watchers: 20
