Hadoop Common / HADOOP-16461

Regression: FileSystem cache lock parses XML within the lock

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.0
    • Fix Version/s: 3.3.0, 3.1.4, 3.2.2
    • Component/s: fs
    • Labels:
      None

      Description

      https://github.com/apache/hadoop/blob/2546e6ece240924af2188bb39b3954a4896e4a4f/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L3388

            fs = createFileSystem(uri, conf);
            synchronized (this) { // refetch the lock again
              FileSystem oldfs = map.get(key);
              if (oldfs != null) { // a file system is created while lock is releasing
                fs.close(); // close the new file system
                return oldfs;  // return the old file system
              }
      
              // now insert the new file system into the map
              if (map.isEmpty()
                      && !ShutdownHookManager.get().isShutdownInProgress()) {
                ShutdownHookManager.get().addShutdownHook(clientFinalizer, SHUTDOWN_HOOK_PRIORITY);
              }
              fs.key = key;
              map.put(key, fs);
              if (conf.getBoolean(
                  FS_AUTOMATIC_CLOSE_KEY, FS_AUTOMATIC_CLOSE_DEFAULT)) {
                toAutoClose.add(key);
              }
              return fs;
            }
      

      The locked section now registers a shutdown hook, and creating that hook ends up calling

      https://github.com/apache/hadoop/blob/2546e6ece240924af2188bb39b3954a4896e4a4f/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ShutdownHookManager.java#L205

          HookEntry(Runnable hook, int priority) {
            this(hook, priority,
                getShutdownTimeout(new Configuration()),
                TIME_UNIT_DEFAULT);
          }
      

      which constructs a "new Configuration()" and reads the shutdown timeout from it, so the XML configuration is parsed within the locked section.
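
      One general way to avoid this shape is to do the expensive load before taking the lock and to hold the lock only for the map update. The sketch below is an illustration under assumptions, not the change made for this issue: SketchCache, putLoadingInsideLock, putLoadingOutsideLock and the timeoutLoader parameter are hypothetical names, and the loader merely stands in for reading a timeout from a new Configuration().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

class HoistOutOfLockDemo {
  /** Hypothetical cache; names are illustrative and not Hadoop's API. */
  static class SketchCache {
    private final Map<String, Long> map = new HashMap<>();

    /** Problem shape: the expensive loader runs while the cache monitor is held. */
    synchronized void putLoadingInsideLock(String key, LongSupplier timeoutLoader) {
      map.put(key, timeoutLoader.getAsLong());
    }

    /** Alternative shape: load first, then hold the monitor only for the map update. */
    void putLoadingOutsideLock(String key, LongSupplier timeoutLoader) {
      long timeout = timeoutLoader.getAsLong();
      synchronized (this) {
        map.put(key, timeout);
      }
    }
  }

  /** Reports whether the loader ran while the calling thread held the cache monitor. */
  static boolean loaderRanUnderLock(boolean hoisted) {
    SketchCache cache = new SketchCache();
    boolean[] heldLock = new boolean[1];
    // Stand-in for an expensive configuration read; it records whether the lock is held.
    LongSupplier loader = () -> {
      heldLock[0] = Thread.holdsLock(cache);
      return 30L;
    };
    if (hoisted) {
      cache.putLoadingOutsideLock("clientFinalizer", loader);
    } else {
      cache.putLoadingInsideLock("clientFinalizer", loader);
    }
    return heldLock[0];
  }

  public static void main(String[] args) {
    System.out.println("inside-lock variant ran loader under lock: " + loaderRanUnderLock(false));
    System.out.println("hoisted variant ran loader under lock: " + loaderRanUnderLock(true));
  }
}
```

      Thread.holdsLock is used here only to make the difference observable; the hoisted variant trades a possibly wasted load (if another thread wins the race) for a shorter critical section.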

      This indirectly hurts the cache-hit path as well: while one thread holds the lock on the cache ("this"), no other thread can enter the synchronized lookup section either.
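
      The effect on cache hits can be reproduced in isolation. The following is a minimal sketch, assuming a toy cache with a single monitor: ToyCache, insert, lookup and the latch-based "slow init" are hypothetical stand-ins, not Hadoop code. A writer thread holds the monitor while doing slow work, and a reader that only wants an existing entry is sampled while it waits.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class LockContentionDemo {
  /** Hypothetical stand-in for a cache guarded by one monitor; not Hadoop code. */
  static class ToyCache {
    private final Map<String, String> map = new HashMap<>();

    /** Cache-hit path: a plain map lookup that still needs the monitor. */
    synchronized String lookup(String key) {
      return map.get(key);
    }

    /** Insert path: runs slowInit (standing in for XML parsing) while holding the monitor. */
    synchronized void insert(String key, String value, Runnable slowInit) {
      slowInit.run();
      map.put(key, value);
    }
  }

  /** Returns the state of a reader thread that attempts a cache hit during a slow insert. */
  static Thread.State readerStateDuringSlowInsert() {
    ToyCache cache = new ToyCache();
    cache.insert("file:///", "local", () -> { }); // pre-populate the entry the reader wants

    CountDownLatch writerInsideLock = new CountDownLatch(1);
    CountDownLatch releaseWriter = new CountDownLatch(1);
    Thread writer = new Thread(() -> cache.insert("hdfs://nn", "hdfs", () -> {
      writerInsideLock.countDown();
      awaitQuietly(releaseWriter); // simulate slow work under the lock
    }));
    Thread reader = new Thread(() -> cache.lookup("file:///"));
    writer.setDaemon(true);
    reader.setDaemon(true);
    try {
      writer.start();
      writerInsideLock.await(); // the writer now holds the cache monitor
      reader.start();
      // Poll until the reader is parked on the monitor, with a generous timeout.
      long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
      while (reader.getState() != Thread.State.BLOCKED && System.nanoTime() < deadline) {
        Thread.sleep(1);
      }
      Thread.State observed = reader.getState();
      releaseWriter.countDown();
      writer.join();
      reader.join();
      return observed;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  private static void awaitQuietly(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    System.out.println("reader state during slow insert: " + readerStateDuringSlowInsert());
  }
}
```

      The reader never touches the entry being inserted, yet it cannot proceed until the writer leaves the synchronized method, which is the same state reported for the thread in the sample below.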

      https://github.com/apache/tez/blob/master/tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/sort/impl/TezSpillRecord.java#L65

      I/O Setup 0 State: BLOCKED CPU usage on sample: 6ms
      org.apache.hadoop.fs.FileSystem$Cache.getInternal(URI, Configuration, FileSystem$Cache$Key) FileSystem.java:3345
      org.apache.hadoop.fs.FileSystem$Cache.get(URI, Configuration) FileSystem.java:3320
      org.apache.hadoop.fs.FileSystem.get(URI, Configuration) FileSystem.java:479
      org.apache.hadoop.fs.FileSystem.getLocal(Configuration) FileSystem.java:435
      

      This slows down callers that only need the local RawLocalFileSystem whenever other threads are creating HDFS FileSystem objects at the same time.

              People

              • Assignee: Gopal Vijayaraghavan (gopalv)
              • Reporter: Gopal Vijayaraghavan (gopalv)