Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-3280

DFSOutputStream.sync should not be synchronized

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: 2.0.0-alpha
    • Component/s: hdfs-client
    • Labels:
      None

      Description

      HDFS-895 added an optimization to make hflush() much faster by unsynchronizing it. But, we forgot to un-synchronize the deprecated sync() wrapper method. This makes the HBase WAL really slow on 0.23+ since it doesn't take advantage of HDFS-895 anymore.

      1. hdfs-3280.txt
        0.8 kB
        Todd Lipcon

        Activity

        Hide
        Todd Lipcon added a comment -

        Committed to branch-2 (as noted above, this problem isn't present in trunk since sync() was removed).

        Thanks also to Matteo Bertozzi and Jon Hsieh for their help tracking this issue down.

        Show
        Todd Lipcon added a comment - Committed to branch-2 (as noted above, this problem isn't present in trunk since sync() was removed). Thanks also to Matteo Bertozzi and Jon Hsieh for their help tracking this issue down.
        Hide
        Todd Lipcon added a comment -

        Ah, so this explains what you guys thought might be an interaction with Nagle?

        Yep, turned out to be much simpler

        The patch failed on Hudson due to HDFS-3034 having removed the deprecated method. I'll commit this based on Aaron's +1 and based on my manual stress testing using HBase's HLog class, which uses this method.

        No unit tests since it's hard to unit test for performance, and the hflush equivalent is already tested by TestMultithreadedHflush

        Show
        Todd Lipcon added a comment - Ah, so this explains what you guys thought might be an interaction with Nagle? Yep, turned out to be much simpler The patch failed on Hudson due to HDFS-3034 having removed the deprecated method. I'll commit this based on Aaron's +1 and based on my manual stress testing using HBase's HLog class, which uses this method. No unit tests since it's hard to unit test for performance, and the hflush equivalent is already tested by TestMultithreadedHflush
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12522653/hdfs-3280.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2277//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12522653/hdfs-3280.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2277//console This message is automatically generated.
        Hide
        Aaron T. Myers added a comment -

        +1 pending Jenkins.

        Show
        Aaron T. Myers added a comment - +1 pending Jenkins.
        Hide
        Andrew Purtell added a comment -

        Ah, so this explains what you guys thought might be an interaction with Nagle?

        Show
        Andrew Purtell added a comment - Ah, so this explains what you guys thought might be an interaction with Nagle?
        Hide
        Todd Lipcon added a comment -

        Verified that this increased my benchmark performance by a factor of two.

        Show
        Todd Lipcon added a comment - Verified that this increased my benchmark performance by a factor of two.
        Hide
        Todd Lipcon added a comment -

        Trivial patch should fix the issue. I'll verify with an HLog benchmark on a cluster as well.

        This is definitely correct since the method is just a one-line call to another method which is unsynchronized.

        Show
        Todd Lipcon added a comment - Trivial patch should fix the issue. I'll verify with an HLog benchmark on a cluster as well. This is definitely correct since the method is just a one-line call to another method which is unsynchronized.

          People

          • Assignee:
            Todd Lipcon
            Reporter:
            Todd Lipcon
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development