Hadoop Common / HADOOP-1135

Block report processing may incorrectly cause the namenode to delete blocks

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.2
    • Component/s: None
    • Labels:
      None

      Description

      When a block report arrives at the namenode, the namenode goes through all the blocks on that datanode. If a block is not valid, it is marked for deletion. The blocks-to-be-deleted are sent to the datanode in the response to the next heartbeat RPC. The namenode sends at most 100 blocks-to-be-deleted at a time; this limit was introduced as part of HADOOP-994. The bug is that if the number of blocks-to-be-deleted exceeds 100, the namenode marks all the remaining blocks in the block report for deletion, whether or not they are valid.
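
      The failure mode described above can be illustrated with a small, self-contained sketch. This is not the actual namenode code: the class name, method names, and the use of plain long block IDs are all hypothetical, and the buggy condition is only one plausible way the per-heartbeat limit could leak into report processing. The intent is to show why the limit belongs in the heartbeat reply rather than in the decision about which blocks are invalid.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; names and structure are assumptions, not Hadoop's code.
public class BlockReportSketch {
    // Cap on blocks-to-be-deleted per heartbeat reply (100 per the description).
    static final int MAX_DELETES_PER_HEARTBEAT = 100;

    // Buggy shape: once the cap is reached, every remaining reported block
    // is marked for deletion, including valid ones.
    static List<Long> buggyMarkForDeletion(List<Long> reported, Set<Long> valid) {
        List<Long> toDelete = new ArrayList<>();
        for (long block : reported) {
            if (toDelete.size() >= MAX_DELETES_PER_HEARTBEAT || !valid.contains(block)) {
                toDelete.add(block);
            }
        }
        return toDelete;
    }

    // Fixed shape: only invalid blocks are marked, no matter how many there are.
    static List<Long> markForDeletion(List<Long> reported, Set<Long> valid) {
        List<Long> toDelete = new ArrayList<>();
        for (long block : reported) {
            if (!valid.contains(block)) {
                toDelete.add(block);
            }
        }
        return toDelete;
    }

    // The cap is applied only when building a heartbeat reply: remove and
    // return at most MAX_DELETES_PER_HEARTBEAT blocks from the pending list.
    static List<Long> nextHeartbeatBatch(List<Long> pending) {
        int n = Math.min(MAX_DELETES_PER_HEARTBEAT, pending.size());
        List<Long> batch = new ArrayList<>(pending.subList(0, n));
        pending.subList(0, n).clear();
        return batch;
    }
}
```

      With 150 invalid blocks followed by 10 valid ones in a report, the buggy shape marks all 160 blocks, while the fixed shape marks only the 150 invalid blocks and sends them to the datanode over two heartbeats.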

        Activity

        Andrzej Bialecki added a comment -

        IMHO this should go to a 0.12.2 release - this looks like a pretty serious issue, and at the moment it's still well isolated from other changes in the trunk.

        Tom White added a comment -

        I've just committed this. Thanks Dhruba!

        (I've marked it as fixed in 0.13.0, but there is still an open question as to whether this merits a 0.12.2 release.)

        dhruba borthakur added a comment -

        I agree that this could cause data loss.

        Doug Cutting added a comment -

        Does this warrant a 0.12.2 release? It sounds like it could cause data loss...

        Hairong Kuang added a comment -

        +1. The logic looks correct. This makes sure that only invalid blocks will be deleted.

        dhruba borthakur added a comment -

        Code uploaded for code review. Unit test coming soon.


          People

          • Assignee: dhruba borthakur
          • Reporter: dhruba borthakur
          • Votes: 0
          • Watchers: 0
