  1. Hadoop HDFS
  2. HDFS-15386

ReplicaNotFoundException keeps happening in DN after removing multiple DN's data directories

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.4, 3.2.2, 2.10.1, 3.3.1, 3.4.0
    • Component/s: None
    • Labels: None

    Description

      When removing volumes, we need to invalidate all the blocks in those volumes. In the following code (FsDatasetImpl), we keep the blocks that will be invalidated in the blkToInvalidate map. However, because the key of the map is the bpid (Block Pool ID), the entry for a block pool is overwritten each time another removed volume is processed. As a result, the map ends up holding only the blocks of the last volume being removed, and only those blocks are invalidated:

      for (String bpid : volumeMap.getBlockPoolList()) {
        List<ReplicaInfo> blocks = new ArrayList<>();
        for (Iterator<ReplicaInfo> it =
              volumeMap.replicas(bpid).iterator(); it.hasNext();) {
          ReplicaInfo block = it.next();
          final StorageLocation blockStorageLocation =
              block.getVolume().getStorageLocation();
          LOG.trace("checking for block " + block.getBlockId() +
              " with storageLocation " + blockStorageLocation);
          if (blockStorageLocation.equals(sdLocation)) {
            blocks.add(block);
            it.remove();
          }
        }
        blkToInvalidate.put(bpid, blocks);
      }
      

      https://github.com/apache/hadoop/blob/704409d53bf7ebf717a3c2e988ede80f623bbad3/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java#L580-L595
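
      The overwrite described above can be reproduced outside Hadoop. The following is a minimal sketch, not the actual patch: the class BlkToInvalidateSketch, its two methods, and the use of plain strings in place of ReplicaInfo are all hypothetical and exist only for illustration. It assumes the loop above runs once per removed volume against a shared map, and shows one possible way to accumulate blocks per bpid instead of replacing them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the volume-removal loop; block IDs are plain strings.
public class BlkToInvalidateSketch {

  // Mirrors the reported pattern: a fresh list per volume, stored with put().
  // Each volume is modeled as bpid -> blocks stored on that volume.
  static Map<String, List<String>> collectWithPut(
      List<Map<String, List<String>>> removedVolumes) {
    Map<String, List<String>> blkToInvalidate = new HashMap<>();
    for (Map<String, List<String>> volume : removedVolumes) {
      for (Map.Entry<String, List<String>> e : volume.entrySet()) {
        List<String> blocks = new ArrayList<>(e.getValue());
        // put() replaces whatever an earlier volume stored under the same bpid.
        blkToInvalidate.put(e.getKey(), blocks);
      }
    }
    return blkToInvalidate;
  }

  // One possible remedy (an assumption, not necessarily the committed fix):
  // reuse the existing list for the bpid and append to it.
  static Map<String, List<String>> collectWithAppend(
      List<Map<String, List<String>>> removedVolumes) {
    Map<String, List<String>> blkToInvalidate = new HashMap<>();
    for (Map<String, List<String>> volume : removedVolumes) {
      for (Map.Entry<String, List<String>> e : volume.entrySet()) {
        blkToInvalidate
            .computeIfAbsent(e.getKey(), k -> new ArrayList<>())
            .addAll(e.getValue());
      }
    }
    return blkToInvalidate;
  }

  public static void main(String[] args) {
    // Two removed volumes that share the same block pool.
    Map<String, List<String>> vol1 = Map.of("BP-1", List.of("blk_1", "blk_2"));
    Map<String, List<String>> vol2 = Map.of("BP-1", List.of("blk_3"));
    List<Map<String, List<String>>> removed = List.of(vol1, vol2);

    System.out.println(collectWithPut(removed));    // {BP-1=[blk_3]}
    System.out.println(collectWithAppend(removed)); // {BP-1=[blk_1, blk_2, blk_3]}
  }
}
```

      With put(), blk_1 and blk_2 from the first volume never reach the invalidation step, which is consistent with the repeated ReplicaNotFoundException reported here. Appending to the existing list keeps the blocks from every removed volume.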

          People

            Assignee: brfrn169 Toshihiro Suzuki
            Reporter: brfrn169 Toshihiro Suzuki
            Votes: 0
            Watchers: 6
