Hadoop HDFS / HDFS-10285 Storage Policy Satisfier in HDFS / HDFS-16484

[SPS]: Fix an infinite loop bug in SPSPathIdProcessor thread


Details

    Description

      We ran SPS in our cluster and found that the SPSPathIdProcessor thread enters an infinite loop, printing the same log message over and over.

      If the SPSPathIdProcessor thread gets an inodeId whose path does not exist, it enters an infinite loop and can no longer work normally.

      The cause is that ctxt.getNextSPSPath() returns an inodeId whose path does not exist. In the code below, startINode is reset to null only after ctxt.scanAndCollectFiles(startINode) completes inside the try block, and the catch block never resets it. When scanning that inodeId fails with an exception, the thread therefore holds on to the same inodeId forever and retries it on every iteration.

      public void run() {
        LOG.info("Starting SPSPathIdProcessor!.");
        Long startINode = null;
        while (ctxt.isRunning()) {
          try {
            if (!ctxt.isInSafeMode()) {
              if (startINode == null) {
                startINode = ctxt.getNextSPSPath();
              } // else same id will be retried
              if (startINode == null) {
                // Waiting for SPS path
                Thread.sleep(3000);
              } else {
                ctxt.scanAndCollectFiles(startINode);
                // check if directory was empty and no child added to queue
                DirPendingWorkInfo dirPendingWorkInfo =
                    pendingWorkForDirectory.get(startINode);
                if (dirPendingWorkInfo != null
                    && dirPendingWorkInfo.isDirWorkDone()) {
                  ctxt.removeSPSHint(startINode);
                  pendingWorkForDirectory.remove(startINode);
                }
              }
              startINode = null; // Current inode successfully scanned.
            }
          } catch (Throwable t) {
            String reClass = t.getClass().getName();
            if (InterruptedException.class.getName().equals(reClass)) {
              LOG.info("SPSPathIdProcessor thread is interrupted. Stopping..");
              break;
            }
            LOG.warn("Exception while scanning file inodes to satisfy the policy",
                t);
            try {
              Thread.sleep(3000);
            } catch (InterruptedException e) {
              LOG.info("Interrupted while waiting in SPSPathIdProcessor", t);
              break;
            }
          }
        }
      } 
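
One possible way to break the loop, sketched here for illustration only, is to bound the number of retries per inodeId and drop the id once the limit is reached. This is not necessarily the change that was committed for this issue. The `Context` interface, the `MAX_RETRY_COUNT` value, the iteration cap, and the event strings are all hypothetical stand-ins so that the example runs without a Hadoop cluster.

```java
import java.io.FileNotFoundException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

public class SpsRetrySketch {
  /** Hypothetical stand-in for the SPS context; not the real Hadoop interface. */
  interface Context {
    Long getNextSPSPath();
    void scanAndCollectFiles(long inodeId) throws Exception;
  }

  // Assumed limit; a real fix would pick or configure its own value.
  static final int MAX_RETRY_COUNT = 3;

  /** Simplified processing loop that gives up on an inode after repeated failures. */
  static List<String> process(Context ctxt, int maxIterations) {
    List<String> events = new ArrayList<>();
    Long startINode = null;
    int retryCount = 0;
    for (int i = 0; i < maxIterations; i++) {
      try {
        if (startINode == null) {
          retryCount = 0;                      // fresh inode, fresh retry budget
          startINode = ctxt.getNextSPSPath();
        }                                      // else the same id is retried
        if (startINode == null) {
          break;                               // queue drained; the real thread would sleep and poll
        }
        ctxt.scanAndCollectFiles(startINode);
        events.add("scanned " + startINode);
        startINode = null;                     // current inode successfully scanned
      } catch (Exception e) {
        retryCount++;
        events.add("failed " + startINode + " attempt " + retryCount);
        if (retryCount >= MAX_RETRY_COUNT) {
          events.add("skipped " + startINode);
          startINode = null;                   // stop holding the bad inode
        }
      }
    }
    return events;
  }

  /** Runs the loop against a fake context in which inode 2 has no path. */
  static List<String> demo() {
    Queue<Long> queue = new ArrayDeque<>(Arrays.asList(1L, 2L, 3L));
    Context ctxt = new Context() {
      @Override
      public Long getNextSPSPath() {
        return queue.poll();                   // null once the queue is empty
      }

      @Override
      public void scanAndCollectFiles(long inodeId) throws Exception {
        if (inodeId == 2L) {
          throw new FileNotFoundException("no path for inode " + inodeId);
        }
      }
    };
    return process(ctxt, 100);
  }

  public static void main(String[] args) {
    demo().forEach(System.out::println);
  }
}
```

In this sketch, inode 2 is attempted three times and then skipped, so inode 3 still gets scanned. Whether a skipped inode should also have its SPS hint removed or be re-queued later is a design decision the sketch leaves open.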

       

       


          People

            qinyuren qinyuren
            Votes: 0
            Watchers: 3


              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 4h 10m
