Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0-alpha
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The work on branch-20-append was to support sync, for durable HBase WALs, not append. The branch-20-append implementation is known to be buggy. There's been confusion about this, we often answer queries on the list like this. Unfortunately, the way to enable correct sync on branch-1 for HBase is to set dfs.support.append to true in your config, which has the side effect of enabling append (which we don't want to do).

      For v1.x let's:

      1. Always enable the sync path (currently only enabled if dfs.support.append is set)
      2. Remove the dfs.support.append configuration option. Let's keep the code paths though in case we ever fix append on branch-1, in which case we can add the config option back

      For 2.x let's

      1. Always enable the hsync/hflush path
      2. The dfs.support.appends only enables the append specific paths (since the hsync/hflush paths are now always on). Append will still default to being enabled so there is no net effect by default.
      1. hdfs-3120.txt
        21 kB
        Eli Collins
      2. hdfs-3120.txt
        18 kB
        Eli Collins

        Issue Links

          Activity

          Hide
          Tsz Wo Nicholas Sze added a comment -

          > Let's add a new dfs.support.hsync option ...

          There is a little problem that sync in 1.x and hflush in 2.x are different. They different methods and slightly different semantic. Do we really need this new option? How about change dfs.support.append for only append. Sync/hflush is always enabled.

          Show
          Tsz Wo Nicholas Sze added a comment - > Let's add a new dfs.support.hsync option ... There is a little problem that sync in 1.x and hflush in 2.x are different. They different methods and slightly different semantic. Do we really need this new option? How about change dfs.support.append for only append. Sync/hflush is always enabled.
          Hide
          Eli Collins added a comment -

          For 1.x:

          • Add a dfs.support.sync option and enable it by default

          For 2.x:

          • Make hsync/hflush behavior independent of whether dfs.support.appends is enabled, ie you can turn append off and hsync/hflush still work

          Note this does not add a dfs.support.sync option to 2.x This would be useful to people who want to disable both append and sync, eg to make a couple paths cheaper. I think we should not add this option unless it's requested / needed in the future. Note that this code path is already enabled by default, so the change is just that people can currently disable all sync/append code paths by disabling append, now they would just be disabling the append-specific code paths.

          Show
          Eli Collins added a comment - For 1.x: Add a dfs.support.sync option and enable it by default For 2.x: Make hsync/hflush behavior independent of whether dfs.support.appends is enabled, ie you can turn append off and hsync/hflush still work Note this does not add a dfs.support.sync option to 2.x This would be useful to people who want to disable both append and sync, eg to make a couple paths cheaper. I think we should not add this option unless it's requested / needed in the future. Note that this code path is already enabled by default, so the change is just that people can currently disable all sync/append code paths by disabling append, now they would just be disabling the append-specific code paths.
          Hide
          Eli Collins added a comment -

          @Nicholas, mid air collision! What do you think of my last comment?

          Show
          Eli Collins added a comment - @Nicholas, mid air collision! What do you think of my last comment?
          Hide
          Eli Collins added a comment -

          @Nicholas, I'm OK always enabling the sync paths on 1.X as well, currently they're not enabled by default because append is not enabled by default.

          Show
          Eli Collins added a comment - @Nicholas, I'm OK always enabling the sync paths on 1.X as well, currently they're not enabled by default because append is not enabled by default.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          > @Nicholas, I'm OK always enabling the sync paths on 1.X as well, ...

          That's great! Thanks.

          Show
          Tsz Wo Nicholas Sze added a comment - > @Nicholas, I'm OK always enabling the sync paths on 1.X as well, ... That's great! Thanks.
          Hide
          Eli Collins added a comment -

          Forgot to mention, given that we know there are data loss issues with append, why don't we remove the ability to enable it at the same time? It's an incompatible change, however there's also no reason in letting users shoot themselves in the foot.

          Show
          Eli Collins added a comment - Forgot to mention, given that we know there are data loss issues with append, why don't we remove the ability to enable it at the same time? It's an incompatible change, however there's also no reason in letting users shoot themselves in the foot.
          Hide
          Eli Collins added a comment -

          This obviously only pertains to Hadoop 1.x btw.

          Show
          Eli Collins added a comment - This obviously only pertains to Hadoop 1.x btw.
          Hide
          Eli Collins added a comment -

          Latest proposal:

          For 1.x:

          • Always enable the sync path (currently only enabled if dfs.support.append is set)
          • Remove the dfs.support.append configuration option. Let's keep the code paths though in case we ever fix append on branch-1, in which case we can add the config option back

          For 2.x:

          • Always enable the hsync/hflush path
          • The dfs.support.appends only enables the append specific paths (since the hsync/hflush paths are now always on). Append will still default to being enabled so there is no net effect by default

          Sound good?

          Show
          Eli Collins added a comment - Latest proposal: For 1.x: Always enable the sync path (currently only enabled if dfs.support.append is set) Remove the dfs.support.append configuration option. Let's keep the code paths though in case we ever fix append on branch-1, in which case we can add the config option back For 2.x: Always enable the hsync/hflush path The dfs.support.appends only enables the append specific paths (since the hsync/hflush paths are now always on). Append will still default to being enabled so there is no net effect by default Sound good?
          Hide
          Todd Lipcon added a comment -

          sounds good to me, +1 for that proposal

          Show
          Todd Lipcon added a comment - sounds good to me, +1 for that proposal
          Hide
          Tsz Wo Nicholas Sze added a comment -

          +1 sounds good to me too.

          Show
          Tsz Wo Nicholas Sze added a comment - +1 sounds good to me too.
          Hide
          Eli Collins added a comment -

          Thanks guys. I filed a separate jira (HADOOP-8230) for the 1.x change so I can clearly separate it from the 2.x change, and eg indicate that it is an incompatible change.

          I've filed HBASE-5676 to share the good news.

          Show
          Eli Collins added a comment - Thanks guys. I filed a separate jira ( HADOOP-8230 ) for the 1.x change so I can clearly separate it from the 2.x change, and eg indicate that it is an incompatible change. I've filed HBASE-5676 to share the good news.
          Hide
          Eli Collins added a comment -

          Patch attached.

          • Enables non-append hsync/hflush paths by default
          • Adds a dfs.support.appends entry to hdfs-default.xml
          • Removes explicit enablement of dfs.support.appends from tests so we cover that append
            is enabled by default
          Show
          Eli Collins added a comment - Patch attached. Enables non-append hsync/hflush paths by default Adds a dfs.support.appends entry to hdfs-default.xml Removes explicit enablement of dfs.support.appends from tests so we cover that append is enabled by default
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12520627/hdfs-3120.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 51 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hdfs.TestFileAppend4

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2127//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2127//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520627/hdfs-3120.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 51 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hdfs.TestFileAppend4 +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2127//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2127//console This message is automatically generated.
          Hide
          Eli Collins added a comment -

          The TestFileAppend4 timeout is unrelated.

          Show
          Eli Collins added a comment - The TestFileAppend4 timeout is unrelated.
          Hide
          Todd Lipcon added a comment -

          Patch looks OK. Can you verify TestFileAppend4 locally before committing? Seems suspect that an append-related test would fail with this append-related patch.

          Show
          Todd Lipcon added a comment - Patch looks OK. Can you verify TestFileAppend4 locally before committing? Seems suspect that an append-related test would fail with this append-related patch.
          Hide
          Eli Collins added a comment -

          Yea, it passes for me locally, given that the code paths don't change - hscync/flush and append are already enabled by default - shouldn't change the execution of this test. I'll re-run jenkins for sanity. Attached patch had to rebase on trunk since TestDatanodeRestart was moved.

          Show
          Eli Collins added a comment - Yea, it passes for me locally, given that the code paths don't change - hscync/flush and append are already enabled by default - shouldn't change the execution of this test. I'll re-run jenkins for sanity. Attached patch had to rebase on trunk since TestDatanodeRestart was moved.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #1975 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1975/)
          Previous commit was for HDFS-3120, fixing up CHANGES.txt (Revision 1308615)

          Result = SUCCESS
          eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308615
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #1975 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1975/ ) Previous commit was for HDFS-3120 , fixing up CHANGES.txt (Revision 1308615) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308615 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #2049 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2049/)
          Previous commit was for HDFS-3120, fixing up CHANGES.txt (Revision 1308615)

          Result = SUCCESS
          eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308615
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #2049 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2049/ ) Previous commit was for HDFS-3120 , fixing up CHANGES.txt (Revision 1308615) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308615 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12521073/hdfs-3120.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 51 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2165//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12521073/hdfs-3120.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 51 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2165//console This message is automatically generated.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #1987 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1987/)
          Previous commit was for HDFS-3120, fixing up CHANGES.txt (Revision 1308615)

          Result = SUCCESS
          eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308615
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #1987 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1987/ ) Previous commit was for HDFS-3120 , fixing up CHANGES.txt (Revision 1308615) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308615 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12521073/hdfs-3120.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 51 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hdfs.server.namenode.TestValidateConfigurationSettings

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2164//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2164//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12521073/hdfs-3120.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 51 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hdfs.server.namenode.TestValidateConfigurationSettings +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2164//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2164//console This message is automatically generated.
          Hide
          Eli Collins added a comment -

          Test failure is due to port in use, unrelated.

          Show
          Eli Collins added a comment - Test failure is due to port in use, unrelated.
          Hide
          Eli Collins added a comment -

          I've committed this and merged to branch-2. Thanks for the review Todd!

          Show
          Eli Collins added a comment - I've committed this and merged to branch-2. Thanks for the review Todd!
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1004 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1004/)
          Previous commit was for HDFS-3120, fixing up CHANGES.txt (Revision 1308615)

          Result = FAILURE
          eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308615
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #1004 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1004/ ) Previous commit was for HDFS-3120 , fixing up CHANGES.txt (Revision 1308615) Result = FAILURE eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308615 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #1039 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1039/)
          Previous commit was for HDFS-3120, fixing up CHANGES.txt (Revision 1308615)

          Result = FAILURE
          eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308615
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk #1039 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1039/ ) Previous commit was for HDFS-3120 , fixing up CHANGES.txt (Revision 1308615) Result = FAILURE eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1308615 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

            People

            • Assignee:
              Eli Collins
              Reporter:
              Eli Collins
            • Votes:
              0 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development