  1. Hadoop HDFS
  2. HDFS-8031 Follow-on work for erasure coding phase I (striping layout)
  3. HDFS-10858

FBR processing may generate incorrect reportedBlock-blockGroup mapping

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha1
    • Fix Version/s: 3.0.0-alpha2
    • Component/s: erasure-coding
    • Labels: None

    Description

      In BlockManager#reportDiffSorted:

          } else if (reportedState == ReplicaState.FINALIZED &&
                     (storedBlock.findStorageInfo(storageInfo) == -1 ||
                      corruptReplicas.isReplicaCorrupt(storedBlock, dn))) {
            // Add replica if appropriate. If the replica was previously corrupt
            // but now okay, it might need to be updated.
            toAdd.add(new BlockInfoToAdd(storedBlock, replica));
          }
      

      "new BlockInfoToAdd(storedBlock, replica)" is wrong because "replica" (i.e., the reported block) is a reused object provided by BlockListAsLongs#iterator. The iterator later reuses this object by overwriting its ID and generation stamp (GS) in place, so an entry already queued in toAdd can silently end up describing a different reported block. As a result, addStoredBlock can receive a wrong (reportedBlock, stored BlockInfo) mapping. For EC, the reported block is used to calculate the internal block index, so the bug can completely corrupt the internal state of an EC block group.
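
      The aliasing problem described above can be reproduced outside Hadoop. The following is a minimal sketch only: Replica, reusingIterable, collectByReference and collectByCopy are hypothetical stand-ins invented for illustration, not Hadoop classes, and the copy-based variant shows one general way to avoid the problem rather than the contents of the attached patch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReusedReplicaDemo {
    /** Hypothetical stand-in for a reported replica: mutable ID and generation stamp. */
    static final class Replica {
        long id;
        long genStamp;

        Replica() {}

        /** Copy constructor: snapshots the current ID/GS into a fresh object. */
        Replica(Replica other) {
            this.id = other.id;
            this.genStamp = other.genStamp;
        }
    }

    /**
     * Iterates over {id, genStamp} pairs but hands out the SAME Replica
     * instance on every call to next(), overwriting its fields in place.
     * This mimics the object reuse the issue describes (an assumption
     * about the shape of the behavior, not the real Hadoop iterator).
     */
    static Iterable<Replica> reusingIterable(long[][] pairs) {
        return () -> new Iterator<Replica>() {
            private final Replica shared = new Replica();
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < pairs.length;
            }

            @Override
            public Replica next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                shared.id = pairs[next][0];
                shared.genStamp = pairs[next][1];
                next++;
                return shared;
            }
        };
    }

    /** Buggy pattern: stores the iterator's shared object directly. */
    static List<Replica> collectByReference(long[][] pairs) {
        List<Replica> toAdd = new ArrayList<>();
        for (Replica r : reusingIterable(pairs)) {
            toAdd.add(r);
        }
        return toAdd;
    }

    /** Safer pattern: stores a defensive copy of each reported replica. */
    static List<Replica> collectByCopy(long[][] pairs) {
        List<Replica> toAdd = new ArrayList<>();
        for (Replica r : reusingIterable(pairs)) {
            toAdd.add(new Replica(r));
        }
        return toAdd;
    }

    public static void main(String[] args) {
        long[][] report = {{1, 100}, {2, 200}, {3, 300}};
        for (Replica r : collectByReference(report)) {
            System.out.println("by reference: id=" + r.id + " gs=" + r.genStamp);
        }
        for (Replica r : collectByCopy(report)) {
            System.out.println("by copy:      id=" + r.id + " gs=" + r.genStamp);
        }
    }
}
```

      In this sketch, every element collected by reference reports the last ID/GS the iterator produced, while the copied elements keep the values they had when they were queued. Anything derived from the queued object afterwards, such as an index computed from its ID, would therefore be computed from the wrong replica in the by-reference case.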

      Attachments

        1. HDFS-10858.000.patch
          6 kB
          Jing Zhao

            People

              Assignee: Jing Zhao (jingzhao)
              Reporter: Jing Zhao (jingzhao)