Hadoop HDFS / HDFS-14642

processMisReplicatedBlocks does not return correct processed count

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.3.0
    • Component/s: namenode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      HDFS-14053 introduced a method "processMisReplicatedBlocks" to the blockManager, and it is used by fsck to schedule mis-replicated blocks for replication.

      The method should return the number of blocks it processed, but it always returns zero because "processed" is never incremented in the method.

      It should also drop and re-take the write lock every "numBlocksPerIteration" blocks. Because "processed" is never incremented, the inner loop's "processed < limit" condition never becomes false, so the method works through the entire list under a single write lock acquisition and can hold the write lock for a long time.

      public int processMisReplicatedBlocks(List<BlockInfo> blocks) {
        int processed = 0;
        Iterator<BlockInfo> iter = blocks.iterator();
      
        try {
          while (isPopulatingReplQueues() && namesystem.isRunning()
                  && !Thread.currentThread().isInterrupted()
                  && iter.hasNext()) {
            int limit = processed + numBlocksPerIteration;
            namesystem.writeLockInterruptibly();
            try {
              while (iter.hasNext() && processed < limit) {
                BlockInfo blk = iter.next();
                MisReplicationResult r = processMisReplicatedBlock(blk);
                // BUG: "processed" is never incremented here, so it stays 0.
                LOG.debug("BLOCK* processMisReplicatedBlocks: " +
                        "Re-scanned block {}, result is {}", blk, r);
              }
            } finally {
              namesystem.writeUnlock();
            }
          }
        } catch (InterruptedException ex) {
          LOG.info("Caught InterruptedException while scheduling replication work" +
                  " for mis-replicated blocks");
          Thread.currentThread().interrupt();
        }
      
        return processed;
      }
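
      The batching behaviour the method is meant to have can be sketched outside Hadoop. The following is a minimal standalone illustration, not the attached patch: the class BatchedLockSketch, the String block stand-ins, the handle() helper and the lockAcquisitions counter are all hypothetical, and a plain ReentrantReadWriteLock stands in for the namesystem lock. It assumes the fix is simply to increment "processed" once per block inside the inner loop.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Standalone sketch of batched processing under a write lock; not Hadoop code. */
class BatchedLockSketch {
  // Stand-in for the namesystem lock (assumption, for illustration only).
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final int numBlocksPerIteration;
  // Counts write lock acquisitions so the batching is observable.
  private int lockAcquisitions = 0;

  BatchedLockSketch(int numBlocksPerIteration) {
    if (numBlocksPerIteration <= 0) {
      // A non-positive batch size would make the outer loop spin forever.
      throw new IllegalArgumentException("batch size must be positive");
    }
    this.numBlocksPerIteration = numBlocksPerIteration;
  }

  int processAll(List<String> blocks) {
    int processed = 0;
    Iterator<String> iter = blocks.iterator();
    try {
      while (!Thread.currentThread().isInterrupted() && iter.hasNext()) {
        int limit = processed + numBlocksPerIteration;
        lock.writeLock().lockInterruptibly();
        lockAcquisitions++;
        try {
          while (iter.hasNext() && processed < limit) {
            handle(iter.next());
            // The increment missing from the reported method: without it,
            // "processed < limit" stays true and the lock is never released
            // until the whole list has been consumed.
            processed++;
          }
        } finally {
          lock.writeLock().unlock();
        }
      }
    } catch (InterruptedException ex) {
      // Restore the interrupt flag and return what was processed so far.
      Thread.currentThread().interrupt();
    }
    return processed;
  }

  // Hypothetical per-block work; a no-op in this sketch.
  private void handle(String blk) { }

  int getLockAcquisitions() {
    return lockAcquisitions;
  }

  public static void main(String[] args) {
    BatchedLockSketch sketch = new BatchedLockSketch(2);
    int n = sketch.processAll(Arrays.asList("b1", "b2", "b3", "b4", "b5"));
    System.out.println("processed=" + n
        + ", lockAcquisitions=" + sketch.getLockAcquisitions());
  }
}
```

      With five blocks and a batch size of two, this sketch takes the write lock three times and returns five. Removing the increment makes it take the lock once and return zero, which matches the behaviour described in this issue.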

      As a result, fsck logs a warning in the NameNode for every mis-replicated file it schedules for replication, because it checks the returned processed count:

      2019-07-10 15:46:14,790 WARN namenode.NameNode: Fsck: Block manager is able to process only 0 mis-replicated blocks (Total count : 1 ) for path /...
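
      The fsck caller is not shown in this issue, so the following is only a sketch of the kind of check that would produce the message above. It assumes fsck compares the returned count with the number of blocks it passed in; the class FsckCountCheckSketch, the check() method and the path "/example/file" are hypothetical, and only the message text is taken from the log line.

```java
import java.util.Optional;

/** Sketch of a caller-side check on the processed count; not Hadoop code. */
class FsckCountCheckSketch {
  /**
   * Returns the warning text when fewer blocks were processed than were
   * submitted, or an empty Optional when the counts match.
   */
  static Optional<String> check(int processed, int total, String path) {
    if (processed < total) {
      return Optional.of("Fsck: Block manager is able to process only "
          + processed + " mis-replicated blocks (Total count : " + total
          + " ) for path " + path);
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // A count that is always 0 triggers the warning for any non-empty list.
    System.out.println(check(0, 1, "/example/file").orElse("no warning"));
    // A correct count for the same list produces no warning.
    System.out.println(check(1, 1, "/example/file").orElse("no warning"));
  }
}
```

      Under that assumption, a method that always returns zero triggers the warning whenever at least one block is submitted, even though the blocks were in fact processed.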

        Attachments

        1. HDFS-14642.001.patch
          2 kB
          Stephen O'Donnell

              People

              • Assignee:
                sodonnell Stephen O'Donnell
                Reporter:
                sodonnell Stephen O'Donnell
              • Votes:
                0
                Watchers:
                6
