  1. Hadoop HDFS
  2. HDFS-15273

CacheReplicationMonitor holds lock for a long time and leads to NN out of service


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.4.0
    • Component/s: caching, namenode
    • Labels: None

    Description

      CacheReplicationMonitor periodically scans the cache directives and the cached block map. As more and more cache directives are added, each rescan of all cache directives and cached blocks takes longer. The scan holds the global write lock (namesystem.writeLock()) for its entire duration, so the NameNode cannot process any other requests while a scan is running.
      Until this implementation is improved, I think we should warn end users who turn on the CacheManager feature about this risk.

        private void rescan() throws InterruptedException {
          scannedDirectives = 0;
          scannedBlocks = 0;
          try {
            namesystem.writeLock();
            try {
              lock.lock();
              if (shutdown) {
                throw new InterruptedException("CacheReplicationMonitor was " +
                    "shut down.");
              }
              curScanCount = completedScanCount + 1;
            } finally {
              lock.unlock();
            }
      
            resetStatistics();
            rescanCacheDirectives();
            rescanCachedBlockMap();
            blockManager.getDatanodeManager().resetLastCachingDirectiveSentTime();
          } finally {
            namesystem.writeUnlock();
          }
        }
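
      One way to make the risk described above visible is to time how long the write lock is held and warn when it exceeds a threshold. The following is a minimal standalone sketch of that idea, not code from Hadoop or from the attached patches. The class TimedWriteLockScan, its methods, and the threshold parameter are hypothetical names chosen for illustration, and a plain ReentrantReadWriteLock stands in for the namesystem lock.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/**
 * Hypothetical helper (not part of Hadoop): runs a scan while holding a
 * write lock and reports how long the lock was held.
 */
public class TimedWriteLockScan {
  // Stand-in for the global namesystem lock; an assumption for this sketch.
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final long warnThresholdMs;

  public TimedWriteLockScan(long warnThresholdMs) {
    this.warnThresholdMs = warnThresholdMs;
  }

  /** Returns true if the given hold time is above the warning threshold. */
  public boolean exceedsThreshold(long heldMs) {
    return heldMs > warnThresholdMs;
  }

  /**
   * Acquires the write lock, runs the scan, releases the lock, and returns
   * the hold time in milliseconds. A warning is printed after the lock is
   * released so that logging does not lengthen the critical section.
   */
  public long runUnderWriteLock(Runnable scan) {
    long heldMs;
    lock.writeLock().lock();
    long start = System.nanoTime();
    try {
      scan.run();
    } finally {
      heldMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
      lock.writeLock().unlock();
    }
    if (exceedsThreshold(heldMs)) {
      System.err.println("WARN: scan held the write lock for " + heldMs
          + " ms (threshold " + warnThresholdMs + " ms)");
    }
    return heldMs;
  }
}
```

      For example, new TimedWriteLockScan(1000).runUnderWriteLock(scanTask) would print a warning whenever the hypothetical scanTask keeps the lock for more than one second. This only reports the problem; it does not shorten the time the lock is held.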
      

      Attachments

        1. HDFS-15273.003.patch
          8 kB
          Xiaoqiao He
        2. HDFS-15273.002.patch
          8 kB
          Xiaoqiao He
        3. HDFS-15273.001.patch
          7 kB
          Xiaoqiao He

          People

            Assignee: Xiaoqiao He (hexiaoqiao)
            Reporter: Xiaoqiao He (hexiaoqiao)
            Votes: 0
            Watchers: 11

            Dates

              Created:
              Updated:
              Resolved: