  1. Hadoop HDFS
  2. HDFS-10285 Storage Policy Satisfier in HDFS
  3. HDFS-16484

[SPS]: Fix an infinite loop bug in SPSPathIdProcessor thread

Details

    Description

      We ran SPS in our cluster and found that the SPSPathIdProcessor thread enters an infinite loop and prints the same log message over and over.

      If the SPSPathIdProcessor thread gets an inodeId whose path does not exist, it enters an infinite loop and can no longer do any useful work.

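      The cause is that ctxt.getNextSPSPath() returns an inodeId whose path does not exist. When processing that inodeId fails with an exception, control jumps to the catch block and skips the startINode = null assignment, so the thread holds on to this inodeId and retries it forever.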

      public void run() {
        LOG.info("Starting SPSPathIdProcessor!.");
        Long startINode = null;
        while (ctxt.isRunning()) {
          try {
            if (!ctxt.isInSafeMode()) {
              if (startINode == null) {
                startINode = ctxt.getNextSPSPath();
              } // else same id will be retried
              if (startINode == null) {
                // Waiting for SPS path
                Thread.sleep(3000);
              } else {
                ctxt.scanAndCollectFiles(startINode);
                // check if directory was empty and no child added to queue
                DirPendingWorkInfo dirPendingWorkInfo =
                    pendingWorkForDirectory.get(startINode);
                if (dirPendingWorkInfo != null
                    && dirPendingWorkInfo.isDirWorkDone()) {
                  ctxt.removeSPSHint(startINode);
                  pendingWorkForDirectory.remove(startINode);
                }
              }
              startINode = null; // Current inode successfully scanned.
            }
          } catch (Throwable t) {
            String reClass = t.getClass().getName();
            if (InterruptedException.class.getName().equals(reClass)) {
              LOG.info("SPSPathIdProcessor thread is interrupted. Stopping..");
              break;
            }
            LOG.warn("Exception while scanning file inodes to satisfy the policy",
                t);
            try {
              Thread.sleep(3000);
            } catch (InterruptedException e) {
              LOG.info("Interrupted while waiting in SPSPathIdProcessor", t);
              break;
            }
          }
        }
      } 
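
      One possible way to break the loop is to bound the number of retries for a single inodeId and drop it once the limit is reached. The following is a minimal, self-contained sketch of that idea, not the patch attached to this issue. The Context interface, the process() method, the MAX_RETRY_COUNT value and the use of FileNotFoundException are all assumptions made for illustration; only the names getNextSPSPath and scanAndCollectFiles come from the code above.

```java
import java.io.FileNotFoundException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class SpsRetrySketch {

  /** Hypothetical stand-in for the SPS context; not the real HDFS interface. */
  interface Context {
    Long getNextSPSPath();
    void scanAndCollectFiles(long inodeId) throws FileNotFoundException;
  }

  /** Assumed retry limit, chosen only for this sketch. */
  static final int MAX_RETRY_COUNT = 3;

  /**
   * Drains the queue of inode ids and returns the ids that were dropped
   * after failing MAX_RETRY_COUNT times in a row.
   */
  static List<Long> process(Context ctxt) {
    List<Long> dropped = new ArrayList<>();
    Long startINode = null;
    int retryCount = 0;
    while (true) {
      if (startINode == null) {
        startINode = ctxt.getNextSPSPath();
        retryCount = 0; // a fresh id starts with a fresh retry budget
      }
      if (startINode == null) {
        break; // the real thread would sleep and poll again instead
      }
      try {
        ctxt.scanAndCollectFiles(startINode);
        startINode = null; // current inode successfully scanned
      } catch (FileNotFoundException e) {
        retryCount++;
        if (retryCount >= MAX_RETRY_COUNT) {
          dropped.add(startINode);
          startINode = null; // give up on this id so the loop can move on
        } // else the same id will be retried
      }
    }
    return dropped;
  }

  public static void main(String[] args) {
    Queue<Long> queue = new ArrayDeque<>(List.of(1L, 2L, 3L));
    List<Long> scanned = new ArrayList<>();
    Context ctxt = new Context() {
      @Override
      public Long getNextSPSPath() {
        return queue.poll(); // null when the queue is empty
      }

      @Override
      public void scanAndCollectFiles(long inodeId)
          throws FileNotFoundException {
        if (inodeId == 2L) { // simulate an inode whose path is gone
          throw new FileNotFoundException("no path for inode " + inodeId);
        }
        scanned.add(inodeId);
      }
    };
    List<Long> dropped = process(ctxt);
    // prints: scanned=[1, 3] dropped=[2]
    System.out.println("scanned=" + scanned + " dropped=" + dropped);
  }
}
```

      In this sketch inode 2 is retried three times and then dropped, so inode 3 is still processed. A real change would also need to decide whether to remove the SPS hint for the dropped inode and how long to wait between retries.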

       

       

            People

              Assignee: qinyuren
              Reporter: qinyuren

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 4h 10m