  1. Hadoop HDFS
  2. HDFS-15273

CacheReplicationMonitor holds lock for a long time and leads to NN out of service

    Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: caching, namenode
    • Labels:
      None

      Description

      CacheReplicationMonitor periodically scans the cache directives and the cached block map. As more and more cache directives are added, each rescan of all cache directives and cached blocks takes longer. Because the scan holds the global write lock for its entire duration, the NameNode cannot process any other request while the scan is running.
      So I think we should warn end users who turn on the CacheManager feature about this risk until the implementation is improved.

        private void rescan() throws InterruptedException {
          scannedDirectives = 0;
          scannedBlocks = 0;
          try {
            namesystem.writeLock();
            try {
              lock.lock();
              if (shutdown) {
                throw new InterruptedException("CacheReplicationMonitor was " +
                    "shut down.");
              }
              curScanCount = completedScanCount + 1;
            } finally {
              lock.unlock();
            }
      
            resetStatistics();
            rescanCacheDirectives();
            rescanCachedBlockMap();
            blockManager.getDatanodeManager().resetLastCachingDirectiveSentTime();
          } finally {
            namesystem.writeUnlock();
          }
        }
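
      The snippet above shows that every scan step runs between namesystem.writeLock() and namesystem.writeUnlock(). As a rough illustration of the kind of warning proposed here, the following sketch measures how long a scan holds a write lock and reports when a threshold is exceeded. It is not taken from the attached patch: the class LockHoldSketch, the method timedRescan, and the threshold parameter are hypothetical names, and a plain ReentrantReadWriteLock stands in for the namesystem lock.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the monitor; not Hadoop code.
class LockHoldSketch {
  // Stand-in for the global namesystem lock (assumption).
  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
  private final long warnThresholdMs;
  private int warnings = 0;

  LockHoldSketch(long warnThresholdMs) {
    this.warnThresholdMs = warnThresholdMs;
  }

  // Runs scanBody under the write lock and returns the hold time in ms.
  long timedRescan(Runnable scanBody) {
    long heldMs;
    // Acquire before the try block so unlock only runs if lock succeeded.
    fsLock.writeLock().lock();
    final long start = System.nanoTime();
    try {
      scanBody.run();
    } finally {
      heldMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
      fsLock.writeLock().unlock();
    }
    // Report outside the lock so logging does not extend the hold time.
    if (heldMs > warnThresholdMs) {
      warnings++;
      System.err.println("Cache rescan held the write lock for " + heldMs
          + " ms (threshold " + warnThresholdMs + " ms)");
    }
    return heldMs;
  }

  int getWarnings() {
    return warnings;
  }

  public static void main(String[] args) {
    LockHoldSketch monitor = new LockHoldSketch(10);
    // Simulate a slow scan over many directives with a short sleep.
    monitor.timedRescan(() -> {
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    System.out.println("warnings so far: " + monitor.getWarnings());
  }
}
```

      A measurement like this only makes the long hold visible to operators. It does not shorten the time during which other requests are blocked.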
      

        Attachments

        1. HDFS-15273.001.patch
          7 kB
          Xiaoqiao He

            People

            • Assignee:
              hexiaoqiao Xiaoqiao He
              Reporter:
              hexiaoqiao Xiaoqiao He
            • Votes:
              0
              Watchers:
              7
