Hadoop HDFS / HDFS-4479

logSync() with the FSNamesystem lock held in commitBlockSynchronization

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.2.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      In FSNamesystem#commitBlockSynchronization on branch-1, logSync() may be called while the FSNamesystem lock is held. As with HDFS-4186, this can degrade NameNode performance.

      The issue was observed in a cluster running a Hive job that was writing to 100,000 temporary files (each task writing to thousands of files). When the job was killed, a large number of files were left open for write. When the leases on these open files eventually expired, lease recovery started for all of them within a very short period. This caused a large number of commitBlockSynchronization calls, each performing logSync with the FSNamesystem lock held, which overloaded the NameNode and slowed it down.

      Since logSync is already called right after the synchronized section, the logSync call made while the lock is held is redundant and can simply be removed.
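The locking pattern described above can be illustrated with a minimal, self-contained sketch. It is not the branch-1 code or the attached patch: the classes `EditLog` and `Namesystem` below are hypothetical stand-ins, and the method bodies are assumptions made only to show an edit being buffered under the lock and synced after the lock is released.

```java
import java.util.ArrayList;
import java.util.List;

public class LogSyncSketch {
    /** Hypothetical stand-in for the edit log: buffering is cheap, syncing is slow. */
    static class EditLog {
        private final List<String> buffered = new ArrayList<>();
        private final List<String> durable = new ArrayList<>();

        // Append an edit to the in-memory buffer (cheap, safe to do under a lock).
        synchronized void logEdit(String op) {
            buffered.add(op);
        }

        // Flush buffered edits; in a real system this would be slow disk I/O.
        synchronized void logSync() {
            durable.addAll(buffered);
            buffered.clear();
        }

        synchronized int durableCount() {
            return durable.size();
        }
    }

    /** Hypothetical stand-in for the namesystem and its global lock. */
    static class Namesystem {
        final EditLog editLog = new EditLog();
        boolean syncedUnderLock = false;

        void commitBlockSynchronization(String file) {
            synchronized (this) {
                // Mutate state and buffer the edit while holding the lock.
                editLog.logEdit("close " + file);
                // No logSync() here: syncing under the lock would block
                // every other operation that needs the namesystem lock.
            }
            // Record whether this thread still holds the lock at sync time.
            syncedUnderLock |= Thread.holdsLock(this);
            // Sync once, after the lock has been released.
            editLog.logSync();
        }
    }

    public static void main(String[] args) {
        Namesystem ns = new Namesystem();
        for (int i = 0; i < 3; i++) {
            ns.commitBlockSynchronization("/tmp/file-" + i);
        }
        System.out.println("durable edits: " + ns.editLog.durableCount());
        System.out.println("synced under lock: " + ns.syncedUnderLock);
    }
}
```

In this sketch every edit still becomes durable before the call returns, but the slow sync no longer serializes other lock holders, which is the effect the description attributes to removing the in-lock logSync call.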

      1. HDFS-4479.b1.002.patch
        2 kB
        Jing Zhao
      2. HDFS-4479.b1.001.patch
        0.6 kB
        Jing Zhao


            People

            • Assignee: Jing Zhao
            • Reporter: Jing Zhao
            • Votes: 0
            • Watchers: 5
