  1. Hadoop Map/Reduce
  2. MAPREDUCE-7101

Add config parameter to allow JHS to always scan user dir irrespective of modTime


Details

    • Bug
    • Status: Resolved
    • Critical
    • Resolution: Fixed
    • None
    • 2.10.0, 3.2.0, 3.1.1
    • None
    • None

    Description

      Currently, the JHS scans a directory only if the directory's modification time has changed:

       
          public synchronized void scanIfNeeded(FileStatus fs) {
            long newModTime = fs.getModificationTime();
            if (modTime != newModTime) {
              // ... some logic omitted ...
              // reset scanTime before scanning happens
              scanTime = System.currentTimeMillis();
              Path p = fs.getPath();
              try {
                scanIntermediateDirectory(p);
      

      This logic assumes that a directory's modification time is updated whenever a file is placed under the directory.

      However, the semantics of a directory's modification time are not consistent across FS implementations. For example, MAPREDUCE-6680 fixed some issues caused by truncated modification times, and HADOOP-12837 notes that on S3 a directory's modification time is always 0.

      I think we need to revisit this logic so that it works more robustly across different file systems.
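
      To make the problem and the proposed direction concrete, here is a minimal, self-contained sketch. It is not the JHS code or the attached patch: the class UserDirScanGate, the alwaysScan flag, the scan counter, and the -1 "never scanned" sentinel are all hypothetical stand-ins. It only illustrates how a modTime-gated scan stalls when the file system always reports the same modification time, and how a configurable override could bypass the check.

```java
/**
 * Hypothetical stand-in for the per-user directory scan state.
 * Names and structure are assumptions for illustration only.
 */
class UserDirScanGate {
    // Hypothetical flag: when true, scan regardless of modification time.
    private final boolean alwaysScan;
    // Assumed sentinel meaning "never scanned yet".
    private long modTime = -1L;
    // Counts scans so the effect of the gate is observable.
    private int scanCount = 0;

    UserDirScanGate(boolean alwaysScan) {
        this.alwaysScan = alwaysScan;
    }

    /** Scans if the flag is set or the modification time changed; returns true if a scan ran. */
    synchronized boolean scanIfNeeded(long newModTime) {
        if (alwaysScan || modTime != newModTime) {
            modTime = newModTime;
            scanCount++; // placeholder for the real directory scan
            return true;
        }
        return false;
    }

    synchronized int getScanCount() {
        return scanCount;
    }

    public static void main(String[] args) {
        // Simulate a file system that always reports modification time 0.
        UserDirScanGate modTimeOnly = new UserDirScanGate(false);
        UserDirScanGate always = new UserDirScanGate(true);
        for (int i = 0; i < 3; i++) {
            modTimeOnly.scanIfNeeded(0L);
            always.scanIfNeeded(0L);
        }
        System.out.println("modTime-gated scans: " + modTimeOnly.getScanCount());
        System.out.println("always-scan scans:   " + always.getScanCount());
    }
}
```

      In this sketch the modTime-gated instance scans only on the first call, while the instance with the override scans on every call. The trade-off is extra listing work on file systems where the modification time is already reliable, which is why an opt-in setting seems preferable to changing the default behavior.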

      Attachments

        1. MAPREDUCE-7101.001.patch
          4 kB
          Arun Suresh
        2. MAPREDUCE-7101.001.patch
          4 kB
          Thomas Marqardt


          People

            tmarquardt Thomas Marqardt
            leftnoteasy Wangda Tan
            Votes: 0
            Watchers: 8
