  1. Hadoop HDFS
  2. HDFS-13770

dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.7
    • Fix Version/s: 2.10.0, 2.8.6, 2.9.3
    • Component/s: hdfs
    • Labels:
      None
    • Target Version/s:

      Description

      The "Missing blocks (with replication factor 1)" metric is not always decreased when a file is deleted.

      When a file is deleted, the remove function of UnderReplicatedBlocks can be called with the wrong priority (UnderReplicatedBlocks.LEVEL). In that case the block is still removed from the priority queue that contains it, because remove falls back to scanning all queues, but the corruptReplOneBlocks metric is not decreased.

      The corresponding code:

      /** remove a block from a under replication queue */
      synchronized boolean remove(BlockInfo block,
          int oldReplicas,
          int oldReadOnlyReplicas,
          int decommissionedReplicas,
          int oldExpectedReplicas) {
        final int priLevel = getPriority(oldReplicas, oldReadOnlyReplicas,
            decommissionedReplicas, oldExpectedReplicas);
        boolean removedBlock = remove(block, priLevel);
        if (priLevel == QUEUE_WITH_CORRUPT_BLOCKS &&
            oldExpectedReplicas == 1 &&
            removedBlock) {
          corruptReplOneBlocks--;
          assert corruptReplOneBlocks >= 0 :
              "Number of corrupt blocks with replication factor 1 " +
              "should be non-negative";
        }
        return removedBlock;
      }
      
      /**
       * Remove a block from the under replication queues.
       *
       * The priLevel parameter is a hint of which queue to query
       * first: if negative or >= {@link #LEVEL} this shortcutting
       * is not attempted.
       *
       * If the block is not found in the nominated queue, an attempt is made to
       * remove it from all queues.
       *
       * <i>Warning:</i> This is not a synchronized method.
       * @param block block to remove
       * @param priLevel expected privilege level
       * @return true if the block was found and removed from one of the priority queues
       */
      boolean remove(BlockInfo block, int priLevel) {
        if (priLevel >= 0 && priLevel < LEVEL
            && priorityQueues.get(priLevel).remove(block)) {
          NameNode.blockStateChangeLog.debug(
              "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block {}" +
              " from priority queue {}", block, priLevel);
          return true;
        } else {
          // Try to remove the block from all queues if the block was
          // not found in the queue for the given priority level.
          for (int i = 0; i < LEVEL; i++) {
            if (i != priLevel && priorityQueues.get(i).remove(block)) {
              NameNode.blockStateChangeLog.debug(
                  "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block" +
                  " {} from priority queue {}", block, i);
              return true;
            }
          }
        }
        return false;
      }
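
The effect described above can be reproduced with a small stand-alone model. This is only a sketch: the class StaleCounterDemo, the String block IDs, the helper addCorruptReplOne, and the constant values (LEVEL = 5, QUEUE_WITH_CORRUPT_BLOCKS = 4) are assumptions made for illustration and are not the real UnderReplicatedBlocks implementation. It shows that calling the two-argument remove with LEVEL as the hint removes the block but leaves the counter untouched.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the priority queues; NOT the real UnderReplicatedBlocks class. */
class StaleCounterDemo {
  static final int LEVEL = 5;                     // assumed number of queues
  static final int QUEUE_WITH_CORRUPT_BLOCKS = 4; // assumed index of the corrupt queue

  final List<Set<String>> priorityQueues = new ArrayList<>();
  int corruptReplOneBlocks = 0;

  StaleCounterDemo() {
    for (int i = 0; i < LEVEL; i++) {
      priorityQueues.add(new HashSet<>());
    }
  }

  /** Hypothetical helper: register a corrupt block with expected replication 1. */
  void addCorruptReplOne(String block) {
    if (priorityQueues.get(QUEUE_WITH_CORRUPT_BLOCKS).add(block)) {
      corruptReplOneBlocks++;
    }
  }

  /** Mirrors the two-argument remove: try the hinted queue, then all others. */
  boolean remove(String block, int priLevel) {
    if (priLevel >= 0 && priLevel < LEVEL
        && priorityQueues.get(priLevel).remove(block)) {
      return true;
    }
    for (int i = 0; i < LEVEL; i++) {
      if (i != priLevel && priorityQueues.get(i).remove(block)) {
        return true; // found in another queue; the counter is never touched here
      }
    }
    return false;
  }

  public static void main(String[] args) {
    StaleCounterDemo queues = new StaleCounterDemo();
    queues.addCorruptReplOne("blk_1");

    // Deletion path from the report: the hint is LEVEL, so all queues are scanned.
    boolean removed = queues.remove("blk_1", LEVEL);

    System.out.println("removed=" + removed);                                  // removed=true
    System.out.println("corruptReplOneBlocks=" + queues.corruptReplOneBlocks); // corruptReplOneBlocks=1
  }
}
```

The block is gone from every queue, yet the counter still reports one missing block with replication factor 1, which matches the stale value seen in dfsadmin -report.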
      

      This is already fixed on trunk by HDFS-10999, but that ticket introduces new metrics, which I think shouldn't be backported to branch-2.
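
One way a fix that adds no new metrics could look is to let the removal itself decide whether to decrement the counter, based on the queue in which the block was actually found rather than on the caller's hint. The sketch below is a hypothetical illustration under that assumption and is not taken from the attached patches; the class CounterAwareRemoveSketch, the three-argument remove signature, the String block IDs, and the constant values are all invented for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model only; names, signature, and constant values are illustrative. */
class CounterAwareRemoveSketch {
  static final int LEVEL = 5;                     // assumed number of queues
  static final int QUEUE_WITH_CORRUPT_BLOCKS = 4; // assumed index of the corrupt queue

  final List<Set<String>> priorityQueues = new ArrayList<>();
  int corruptReplOneBlocks = 0;

  CounterAwareRemoveSketch() {
    for (int i = 0; i < LEVEL; i++) {
      priorityQueues.add(new HashSet<>());
    }
  }

  /** Hypothetical helper: register a corrupt block with expected replication 1. */
  void addCorruptReplOne(String block) {
    if (priorityQueues.get(QUEUE_WITH_CORRUPT_BLOCKS).add(block)) {
      corruptReplOneBlocks++;
    }
  }

  /**
   * Removes the block and adjusts the counter according to the queue the
   * block was really found in, so a wrong priLevel hint no longer matters.
   */
  boolean remove(String block, int priLevel, int oldExpectedReplicas) {
    int foundIn = -1;
    if (priLevel >= 0 && priLevel < LEVEL
        && priorityQueues.get(priLevel).remove(block)) {
      foundIn = priLevel;
    } else {
      for (int i = 0; i < LEVEL && foundIn < 0; i++) {
        if (i != priLevel && priorityQueues.get(i).remove(block)) {
          foundIn = i;
        }
      }
    }
    if (foundIn == QUEUE_WITH_CORRUPT_BLOCKS && oldExpectedReplicas == 1) {
      corruptReplOneBlocks--;
    }
    return foundIn >= 0;
  }

  public static void main(String[] args) {
    CounterAwareRemoveSketch queues = new CounterAwareRemoveSketch();
    queues.addCorruptReplOne("blk_1");

    // Same wrong hint as in the report, but the counter now follows the removal.
    boolean removed = queues.remove("blk_1", LEVEL, 1);

    System.out.println("removed=" + removed);                                  // removed=true
    System.out.println("corruptReplOneBlocks=" + queues.corruptReplOneBlocks); // corruptReplOneBlocks=0
  }
}
```

Keeping the decrement next to the queue removal means every caller gets consistent accounting, at the cost of passing the expected replication into the lower-level method.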

       

        Attachments

        1. HDFS-13770-branch-2.001.patch
          7 kB
          Kitti Nanasi
        2. HDFS-13770-branch-2.002.patch
          9 kB
          Kitti Nanasi
        3. HDFS-13770-branch-2.003.patch
          9 kB
          Kitti Nanasi
        4. HDFS-13770-branch-2.004.patch
          9 kB
          Wei-Chiu Chuang
        5. HDFS-13770-branch-2-005.patch
          9 kB
          Wei-Chiu Chuang


            People

            • Assignee:
              Kitti Nanasi (knanasi)
            • Reporter:
              Kitti Nanasi (knanasi)
            • Votes:
              0
            • Watchers:
              6

              Dates

              • Created:
                Updated:
                Resolved: