HADOOP-16461 (project: Hadoop Common)

Regression: FileSystem cache lock parses XML within the lock


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.0
    • Fix Version/s: 3.3.0, 3.1.4, 3.2.2
    • Component/s: fs
    • Labels: None

    Description

      https://github.com/apache/hadoop/blob/2546e6ece240924af2188bb39b3954a4896e4a4f/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L3388

            fs = createFileSystem(uri, conf);
            synchronized (this) { // refetch the lock again
              FileSystem oldfs = map.get(key);
              if (oldfs != null) { // a file system is created while lock is releasing
                fs.close(); // close the new file system
                return oldfs;  // return the old file system
              }
      
              // now insert the new file system into the map
              if (map.isEmpty()
                      && !ShutdownHookManager.get().isShutdownInProgress()) {
                ShutdownHookManager.get().addShutdownHook(clientFinalizer, SHUTDOWN_HOOK_PRIORITY);
              }
              fs.key = key;
              map.put(key, fs);
              if (conf.getBoolean(
                  FS_AUTOMATIC_CLOSE_KEY, FS_AUTOMATIC_CLOSE_DEFAULT)) {
                toAutoClose.add(key);
              }
              return fs;
            }
      

      The synchronized block now registers a shutdown hook, which ends up calling

      https://github.com/apache/hadoop/blob/2546e6ece240924af2188bb39b3954a4896e4a4f/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ShutdownHookManager.java#L205

          HookEntry(Runnable hook, int priority) {
            this(hook, priority,
                getShutdownTimeout(new Configuration()),
                TIME_UNIT_DEFAULT);
          }
      

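      which constructs a "new Configuration()" and reads the shutdown timeout from it inside the locked section, so the configuration XML is parsed while the cache lock is held.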

      This indirectly hurts cache-hit lookups as well: they synchronize on the same Cache instance, so while that lock is held the lookup section cannot be entered either.
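      The blocking effect can be reproduced in isolation. The following is a minimal sketch, not Hadoop code: SharedLockCacheSketch, insertSlowly, and measureBlockedLookupMillis are hypothetical names, and Thread.sleep stands in for the slow work done under the lock. It only assumes that the hit path and the insert path share one monitor.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;

/** Hypothetical stand-in for a cache guarded by one monitor; not Hadoop code. */
class SharedLockCacheSketch {
  private final Map<String, String> map = new HashMap<>();

  /** Cache-hit path: a cheap map lookup that still needs the monitor. */
  synchronized String lookup(String key) {
    return map.get(key);
  }

  /** Cache-miss path: inserts, but does slow work while holding the monitor. */
  synchronized void insertSlowly(String key, String value, long slowMillis,
      CountDownLatch lockHeld) {
    lockHeld.countDown();            // tell the reader the monitor is now held
    try {
      Thread.sleep(slowMillis);      // stands in for slow work such as XML parsing
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    map.put(key, value);
  }

  /** Returns how many milliseconds a cache-hit lookup waited for the monitor. */
  static long measureBlockedLookupMillis(long slowMillis) {
    SharedLockCacheSketch cache = new SharedLockCacheSketch();
    // Pre-populate the entry that the reader will look up.
    cache.insertSlowly("file:///", "local", 0, new CountDownLatch(1));
    CountDownLatch lockHeld = new CountDownLatch(1);
    Thread writer = new Thread(
        () -> cache.insertSlowly("hdfs://nn/", "hdfs", slowMillis, lockHeld));
    writer.start();
    try {
      lockHeld.await();              // writer is now inside the synchronized method
      long start = System.nanoTime();
      String hit = cache.lookup("file:///");  // blocks until the writer leaves
      long waitedMillis = (System.nanoTime() - start) / 1_000_000;
      writer.join();
      if (!"local".equals(hit)) {
        throw new IllegalStateException("expected a cache hit");
      }
      return waitedMillis;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }
}
```

      Calling SharedLockCacheSketch.measureBlockedLookupMillis(200) should typically report a wait close to the full 200 ms, even though the lookup itself is a single map read.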

      https://github.com/apache/tez/blob/master/tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/sort/impl/TezSpillRecord.java#L65

      I/O Setup 0 State: BLOCKED CPU usage on sample: 6ms
      org.apache.hadoop.fs.FileSystem$Cache.getInternal(URI, Configuration, FileSystem$Cache$Key) FileSystem.java:3345
      org.apache.hadoop.fs.FileSystem$Cache.get(URI, Configuration) FileSystem.java:3320
      org.apache.hadoop.fs.FileSystem.get(URI, Configuration) FileSystem.java:479
      org.apache.hadoop.fs.FileSystem.getLocal(Configuration) FileSystem.java:435
      

      This slows down RawLocalFileSystem lookups through FileSystem.getLocal() when other threads are creating HDFS FileSystem objects at the same time.
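      One way to shrink the critical section is to compute the expensive value before taking the lock and pass it in explicitly. The four-argument HookEntry constructor visible in the snippet above suggests the timeout can be supplied by the caller. The sketch below illustrates that pattern under stated assumptions and is not necessarily the change committed for this issue: PrecomputedTimeoutCacheSketch, expensiveTimeoutLookup, and the value 30 are hypothetical, and storing the timeout in a field stands in for registering the shutdown hook.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical cache sketch; names and values are illustrative, not Hadoop code. */
class PrecomputedTimeoutCacheSketch {
  private final Map<String, String> map = new HashMap<>();
  private long registeredTimeoutSeconds = -1;      // -1 means no hook registered yet
  private volatile boolean expensiveRanUnderLock = false;

  /** Stands in for reading the shutdown timeout from a freshly loaded configuration. */
  private long expensiveTimeoutLookup() {
    if (Thread.holdsLock(this)) {
      expensiveRanUnderLock = true;   // record a violation of the intended lock scope
    }
    return 30L;                       // illustrative value only
  }

  String getOrCreate(String key, String created) {
    synchronized (this) {             // fast path: cache hit
      String existing = map.get(key);
      if (existing != null) {
        return existing;
      }
    }
    // Slow work happens here, while no cache lock is held.
    long timeoutSeconds = expensiveTimeoutLookup();
    synchronized (this) {             // re-acquire the lock and re-check the map
      String raced = map.get(key);
      if (raced != null) {
        return raced;
      }
      if (map.isEmpty()) {
        // Stands in for registering the shutdown hook with an explicit timeout.
        registeredTimeoutSeconds = timeoutSeconds;
      }
      map.put(key, created);
      return created;
    }
  }

  synchronized long registeredTimeoutSeconds() {
    return registeredTimeoutSeconds;
  }

  boolean expensiveRanUnderLock() {
    return expensiveRanUnderLock;
  }
}
```

      The trade-off in this sketch is that the timeout is computed on every cache miss, even when the hook is already registered. That cost is paid outside the lock, so it no longer delays other threads.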

            People

              Assignee: gopalv Gopal Vijayaraghavan
              Reporter: gopalv Gopal Vijayaraghavan