Hadoop HDFS / HDFS-14642

processMisReplicatedBlocks does not return correct processed count

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.3.0
    • Component/s: namenode
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      HDFS-14053 added the method "processMisReplicatedBlocks" to BlockManager; fsck uses it to schedule mis-replicated blocks for replication.

      The method should return the number of blocks it processed, but it always returns zero because "processed" is never incremented inside the method.

      It should also drop and re-take the write lock every "numBlocksPerIteration" blocks. Because "processed" is never incremented, the inner loop condition "processed < limit" stays true until the iterator is exhausted, so the whole list is handled under a single write lock acquisition and the lock can be held for a long time.

      public int processMisReplicatedBlocks(List<BlockInfo> blocks) {
        int processed = 0;
        Iterator<BlockInfo> iter = blocks.iterator();
      
        try {
          while (isPopulatingReplQueues() && namesystem.isRunning()
                  && !Thread.currentThread().isInterrupted()
                  && iter.hasNext()) {
            int limit = processed + numBlocksPerIteration;
            namesystem.writeLockInterruptibly();
            try {
              while (iter.hasNext() && processed < limit) {
                BlockInfo blk = iter.next();
                MisReplicationResult r = processMisReplicatedBlock(blk);
                LOG.debug("BLOCK* processMisReplicatedBlocks: " +
                        "Re-scanned block {}, result is {}", blk, r);
              }
            } finally {
              namesystem.writeUnlock();
            }
          }
        } catch (InterruptedException ex) {
          LOG.info("Caught InterruptedException while scheduling replication work" +
                  " for mis-replicated blocks");
          Thread.currentThread().interrupt();
        }
      
        return processed;
      }

      As a result, fsck logs a warning in the NameNode for every mis-replicated file it schedules for replication, because it checks the processed count returned by the method:

      2019-07-10 15:46:14,790 WARN namenode.NameNode: Fsck: Block manager is able to process only 0 mis-replicated blocks (Total count : 1 ) for path /...
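
      One way to address both symptoms is to increment the counter after each block is handled. The following is a minimal, self-contained sketch of that loop shape, not the contents of the attached patch: "BatchedProcessor", "processOne" and "getLockAcquisitions" are hypothetical names, a plain ReentrantReadWriteLock stands in for the namesystem lock, and strings stand in for BlockInfo.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the block manager loop; these are not Hadoop classes.
class BatchedProcessor {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final int numBlocksPerIteration;
  private int lockAcquisitions = 0; // only here so the batching can be observed

  BatchedProcessor(int numBlocksPerIteration) {
    this.numBlocksPerIteration = numBlocksPerIteration;
  }

  int processAll(List<String> blocks) {
    int processed = 0;
    Iterator<String> iter = blocks.iterator();
    try {
      while (!Thread.currentThread().isInterrupted() && iter.hasNext()) {
        // Upper bound for this batch, relative to the running count.
        int limit = processed + numBlocksPerIteration;
        lock.writeLock().lockInterruptibly();
        lockAcquisitions++;
        try {
          while (iter.hasNext() && processed < limit) {
            processOne(iter.next());
            // The increment missing from the method quoted above: without it,
            // "processed < limit" never becomes false and the count stays 0.
            processed++;
          }
        } finally {
          lock.writeLock().unlock(); // released between batches
        }
      }
    } catch (InterruptedException ex) {
      Thread.currentThread().interrupt();
    }
    return processed;
  }

  // Placeholder for the per-block work (assumption: details do not matter here).
  private void processOne(String blk) {
  }

  int getLockAcquisitions() {
    return lockAcquisitions;
  }
}
```

      With five blocks and a batch size of 2, processAll returns 5 and takes the write lock three times (batches of 2, 2 and 1), so the caller sees a processed count that matches the total and other lock waiters get a chance to run between batches.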

      Attachments

        1. HDFS-14642.001.patch
          2 kB
          Stephen O'Donnell

            People

              Assignee: sodonnell Stephen O'Donnell
              Reporter: sodonnell Stephen O'Donnell