Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-4479

logSync() with the FSNamesystem lock held in commitBlockSynchronization

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 1.2.0
    • 1.2.0
    • None
    • None
    • Reviewed

    Description

      In FSNamesystem#commitBlockSynchronization of branch-1, logSync() may be called when the FSNamesystem lock is held. Similar to HDFS-4186, this may cause some performance issue.

      The following issue was observed in a cluster that was running a Hive job and was writing to 100,000 temporary files (each task is writing to 1000s of files). When this job is killed, a large number of files are left open for write. Eventually when the lease for open files expires, lease recovery is started for all these files in a very short duration of time. This causes a large number of commitBlockSynchronization where logSync is performed with the FSNamesystem lock held. This overloads the namenode resulting in slowdown.

      Since logSync is called right after the synchronization section, we can simply remove the logSync call.

      Attachments

        1. HDFS-4479.b1.002.patch
          2 kB
          Jing Zhao
        2. HDFS-4479.b1.001.patch
          0.6 kB
          Jing Zhao

        Issue Links

          Activity

            People

              jingzhao Jing Zhao
              jingzhao Jing Zhao
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: