Accumulo
  1. Accumulo
  2. ACCUMULO-1637

Update HDFS append/sync precondition check for Hadoop 1.2

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 1.5.0
    • Fix Version/s: 1.5.1, 1.6.0
    • Component/s: tserver
    • Labels:
      None

      Description

      Apache Hadoop 1.2.0 ships with the durable sync enabled by default and the support append option marked as obsolete. Because of this, the check inside of TabletServer, meant to ensure that the HDFS WAL can function properly, incorrectly fails as it doesn't know that dfs.durable.sync is on by default.

      This can be worked around by specifying the old durable sync property in hdfs-site.xml:

      <property>
        <name>dfs.durable.sync</name>
        <value>true</value>
      </property>
      

      I'm not sure how to best way to address the differences between the newer and older versions of Hadoop and their differing append support.

      Thanks to Carlos Mundi for pointing this out on user@a.a.o

      Using this table to track the presence of these variables and their default from hdfs/o/a/h/h/DFSConfigKeys and from the codebase when there is no config parameter for it.

      Version DFSConfigKeys.DFS_SUPPORT_APPEND_KEY DFSConfigKeys.DFS_SUPPORT_APPEND_DEFAULT "dfs.durable.sync" Specific Configuration Required
      0.20.205.0 defined false not present yes
      0.23.x defined true not present no
      1.0.x defined false not present yes
      1.1.X not present absent implicit "true" no
      1.2.X not present absent implicit "true" no
      2.0.x defined true not present no
      2.1.x defined true not present no
      2.2.0 defined true not present no

        Activity

        Hide
        ASF subversion and git services added a comment -

        Commit ab6779c30c0308b5905e894c74fce24f6aca7679 in branch refs/heads/master from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=ab6779c ]

        ACCUMULO-1637 Update the HDFS configuration check in VolumnManagerImpl
        from the one in TabletServer.

        Looks like I missed the movement of the configuration validation code in
        1.6.0 from TabletServer to VolumeManagerImpl. Updated the check, and
        fixed the unit test to invoke the correct code.

        Show
        ASF subversion and git services added a comment - Commit ab6779c30c0308b5905e894c74fce24f6aca7679 in branch refs/heads/master from Josh Elser [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=ab6779c ] ACCUMULO-1637 Update the HDFS configuration check in VolumnManagerImpl from the one in TabletServer. Looks like I missed the movement of the configuration validation code in 1.6.0 from TabletServer to VolumeManagerImpl. Updated the check, and fixed the unit test to invoke the correct code.
        Hide
        Josh Elser added a comment -

        Found that the previous check is also in the code run during `accumulo init`

        Show
        Josh Elser added a comment - Found that the previous check is also in the code run during `accumulo init`
        Hide
        ASF subversion and git services added a comment -

        Commit 957676e681c46044e9b46f172b8ab96bfdc6715a in branch refs/heads/master from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=957676e ]

        ACCUMULO-1637 Removing some bad tests that aren't portable across hadoop
        versions.

        Show
        ASF subversion and git services added a comment - Commit 957676e681c46044e9b46f172b8ab96bfdc6715a in branch refs/heads/master from Josh Elser [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=957676e ] ACCUMULO-1637 Removing some bad tests that aren't portable across hadoop versions.
        Hide
        ASF subversion and git services added a comment -

        Commit 957676e681c46044e9b46f172b8ab96bfdc6715a in branch refs/heads/1.5.1-SNAPSHOT from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=957676e ]

        ACCUMULO-1637 Removing some bad tests that aren't portable across hadoop
        versions.

        Show
        ASF subversion and git services added a comment - Commit 957676e681c46044e9b46f172b8ab96bfdc6715a in branch refs/heads/1.5.1-SNAPSHOT from Josh Elser [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=957676e ] ACCUMULO-1637 Removing some bad tests that aren't portable across hadoop versions.
        Hide
        Josh Elser added a comment -

        Borked a test

        Show
        Josh Elser added a comment - Borked a test
        Hide
        ASF subversion and git services added a comment -

        Commit e498ebb802f7f07827b8f0c0a067f8536322b692 in branch refs/heads/master from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=e498ebb ]

        Merge branch '1.5.1-SNAPSHOT'

        Conflicts:
        server/src/main/java/org/apache/accumulo/server/tabletserver/TabletServer.java

        ACCUMULO-1637 Expanding the tests to account for the VolumeManager
        changes in 1.6

        Show
        ASF subversion and git services added a comment - Commit e498ebb802f7f07827b8f0c0a067f8536322b692 in branch refs/heads/master from Josh Elser [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=e498ebb ] Merge branch '1.5.1-SNAPSHOT' Conflicts: server/src/main/java/org/apache/accumulo/server/tabletserver/TabletServer.java ACCUMULO-1637 Expanding the tests to account for the VolumeManager changes in 1.6
        Hide
        ASF subversion and git services added a comment -

        Commit 0987628299165740230de132f85b3c63789ad584 in branch refs/heads/master from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=0987628 ]

        ACCUMULO-1637 Add tests for append/sync conf check

        Added in a few tests, given a hadoop-1.0.4 dependency, that make sure we
        fail when appropriate. Had to change the access control on the method in
        TabletServer as well as throwing an exception instead of directly
        calling System.exit() to make a more easily testable method.

        Show
        ASF subversion and git services added a comment - Commit 0987628299165740230de132f85b3c63789ad584 in branch refs/heads/master from Josh Elser [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=0987628 ] ACCUMULO-1637 Add tests for append/sync conf check Added in a few tests, given a hadoop-1.0.4 dependency, that make sure we fail when appropriate. Had to change the access control on the method in TabletServer as well as throwing an exception instead of directly calling System.exit() to make a more easily testable method.
        Hide
        ASF subversion and git services added a comment -

        Commit 0987628299165740230de132f85b3c63789ad584 in branch refs/heads/1.5.1-SNAPSHOT from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=0987628 ]

        ACCUMULO-1637 Add tests for append/sync conf check

        Added in a few tests, given a hadoop-1.0.4 dependency, that make sure we
        fail when appropriate. Had to change the access control on the method in
        TabletServer as well as throwing an exception instead of directly
        calling System.exit() to make a more easily testable method.

        Show
        ASF subversion and git services added a comment - Commit 0987628299165740230de132f85b3c63789ad584 in branch refs/heads/1.5.1-SNAPSHOT from Josh Elser [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=0987628 ] ACCUMULO-1637 Add tests for append/sync conf check Added in a few tests, given a hadoop-1.0.4 dependency, that make sure we fail when appropriate. Had to change the access control on the method in TabletServer as well as throwing an exception instead of directly calling System.exit() to make a more easily testable method.
        Hide
        ASF subversion and git services added a comment -

        Commit 18bcd14ca8020e969b9881c44cc3c565471cae5b in branch refs/heads/master from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=18bcd14 ]

        ACCUMULO-1637 Missed a usage of the DFSConfigKeys constant that might
        not always be present.

        Show
        ASF subversion and git services added a comment - Commit 18bcd14ca8020e969b9881c44cc3c565471cae5b in branch refs/heads/master from Josh Elser [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=18bcd14 ] ACCUMULO-1637 Missed a usage of the DFSConfigKeys constant that might not always be present.
        Hide
        ASF subversion and git services added a comment -

        Commit 18bcd14ca8020e969b9881c44cc3c565471cae5b in branch refs/heads/1.5.1-SNAPSHOT from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=18bcd14 ]

        ACCUMULO-1637 Missed a usage of the DFSConfigKeys constant that might
        not always be present.

        Show
        ASF subversion and git services added a comment - Commit 18bcd14ca8020e969b9881c44cc3c565471cae5b in branch refs/heads/1.5.1-SNAPSHOT from Josh Elser [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=18bcd14 ] ACCUMULO-1637 Missed a usage of the DFSConfigKeys constant that might not always be present.
        Hide
        ASF subversion and git services added a comment -

        Commit 685cc4a70d87fd7eda56538fb381a8347b4e56e3 in branch refs/heads/master from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=685cc4a ]

        ACCUMULO-1637 Lifting the ticket message into its own string constant

        Show
        ASF subversion and git services added a comment - Commit 685cc4a70d87fd7eda56538fb381a8347b4e56e3 in branch refs/heads/master from Josh Elser [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=685cc4a ] ACCUMULO-1637 Lifting the ticket message into its own string constant
        Hide
        ASF subversion and git services added a comment -

        Commit d1243aafc7dfc9d14ccbe7d67f92055e26228221 in branch refs/heads/master from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=d1243aa ]

        ACCUMULO-1637 Rework the hadoop append/sync checks trying to match what
        Hadoop is doing internally by default.

        Show
        ASF subversion and git services added a comment - Commit d1243aafc7dfc9d14ccbe7d67f92055e26228221 in branch refs/heads/master from Josh Elser [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=d1243aa ] ACCUMULO-1637 Rework the hadoop append/sync checks trying to match what Hadoop is doing internally by default.
        Hide
        ASF subversion and git services added a comment -

        Commit 685cc4a70d87fd7eda56538fb381a8347b4e56e3 in branch refs/heads/1.5.1-SNAPSHOT from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=685cc4a ]

        ACCUMULO-1637 Lifting the ticket message into its own string constant

        Show
        ASF subversion and git services added a comment - Commit 685cc4a70d87fd7eda56538fb381a8347b4e56e3 in branch refs/heads/1.5.1-SNAPSHOT from Josh Elser [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=685cc4a ] ACCUMULO-1637 Lifting the ticket message into its own string constant
        Hide
        ASF subversion and git services added a comment -

        Commit d1243aafc7dfc9d14ccbe7d67f92055e26228221 in branch refs/heads/1.5.1-SNAPSHOT from Josh Elser
        [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=d1243aa ]

        ACCUMULO-1637 Rework the hadoop append/sync checks trying to match what
        Hadoop is doing internally by default.

        Show
        ASF subversion and git services added a comment - Commit d1243aafc7dfc9d14ccbe7d67f92055e26228221 in branch refs/heads/1.5.1-SNAPSHOT from Josh Elser [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=d1243aa ] ACCUMULO-1637 Rework the hadoop append/sync checks trying to match what Hadoop is doing internally by default.
        Hide
        Josh Elser added a comment -

        Good enough. I'll see if I can get back to this later and add some sort of test for the method as well.

        Show
        Josh Elser added a comment - Good enough. I'll see if I can get back to this later and add some sort of test for the method as well.
        Hide
        Eric Newton added a comment -

        Just commit it; this is a much better check than what is there now.

        And hoist the constant " See ACCUMULO-623 and ACCUMULO-1637 for more details.".

        Show
        Eric Newton added a comment - Just commit it; this is a much better check than what is there now. And hoist the constant " See ACCUMULO-623 and ACCUMULO-1637 for more details.".
        Hide
        Josh Elser added a comment -

        Initial stab at reworking the append/sync checks for what Accumulo actually needs.

        Show
        Josh Elser added a comment - Initial stab at reworking the append/sync checks for what Accumulo actually needs.
        Hide
        Eric Newton added a comment -

        We could reflect on DFSConfigKeys (handling the absence of DFS_SUPPORT_APPEND_DEFAULT for the 1.1.x line) and then check for the presence of dfs.support.append being true in the Hadoop conf when DFS_SUPPORT_APPEND_DEFAULT is false.

        I think that's reasonable. Thanks for digging into the version/sync details.

        Show
        Eric Newton added a comment - We could reflect on DFSConfigKeys (handling the absence of DFS_SUPPORT_APPEND_DEFAULT for the 1.1.x line) and then check for the presence of dfs.support.append being true in the Hadoop conf when DFS_SUPPORT_APPEND_DEFAULT is false. I think that's reasonable. Thanks for digging into the version/sync details.
        Hide
        Josh Elser added a comment - - edited

        The constants in DFSConfigKeys aren't stable between versions.

        Right, that was the point of making the table in the description (as I figured was likely the case).

        Show
        Josh Elser added a comment - - edited The constants in DFSConfigKeys aren't stable between versions. Right, that was the point of making the table in the description (as I figured was likely the case).
        Hide
        Josh Elser added a comment -

        John Vines If you have specifics, that'd be awesome.

        Eric Newton Do you know of a means to reliably check that?

        Alright, I think I get it now (someone correct me if I'm wrong). Brain-dumping before I lose it.

        0.20.205.0 and 1.0.x need the parameter 'dfs.support.append' as a means to make sure sync actually works (misnomer on the configuration parameter). 1.1.x (and thus 1.2.x as well) remove this misnomer and introduced 'dfs.durable.sync' as a means for users to turn off sync when they don't want it (irrelevant for us). If you have dfs.support.append defined in 1.1.. Hadoop2 variants (0.23 and 2.x) all support append properly and sync implicitly.

        In other words: 0.20.205.0 and 1.0.x must have the 'dfs.support.append' parameter provided to ensure sync occurs when we call it. >=1.1.0 and >=0.23 must not have dfs.durable.sync=false configured (although, realistically, we probably don't ever want to see that).

        Personally, if someone is still running 0.20.205.0 or 1.0.x, I'm tempted to just say "shame on you" through documentation rather than trying to make a runtime check. We could reflect on DFSConfigKeys (handling the absence of DFS_SUPPORT_APPEND_DEFAULT for the 1.1.x line) and then check for the presence of dfs.support.append being true in the Hadoop conf when DFS_SUPPORT_APPEND_DEFAULT is false. We should probably always check that dfs.durable.sync is absent or true when configured. Thoughts?

        Show
        Josh Elser added a comment - John Vines If you have specifics, that'd be awesome. Eric Newton Do you know of a means to reliably check that? Alright, I think I get it now (someone correct me if I'm wrong). Brain-dumping before I lose it. 0.20.205.0 and 1.0.x need the parameter 'dfs.support.append' as a means to make sure sync actually works (misnomer on the configuration parameter). 1.1.x (and thus 1.2.x as well) remove this misnomer and introduced 'dfs.durable.sync' as a means for users to turn off sync when they don't want it (irrelevant for us). If you have dfs.support.append defined in 1.1.. Hadoop2 variants (0.23 and 2.x) all support append properly and sync implicitly. In other words: 0.20.205.0 and 1.0.x must have the 'dfs.support.append' parameter provided to ensure sync occurs when we call it. >=1.1.0 and >=0.23 must not have dfs.durable.sync=false configured (although, realistically, we probably don't ever want to see that). Personally, if someone is still running 0.20.205.0 or 1.0.x, I'm tempted to just say "shame on you" through documentation rather than trying to make a runtime check. We could reflect on DFSConfigKeys (handling the absence of DFS_SUPPORT_APPEND_DEFAULT for the 1.1.x line) and then check for the presence of dfs.support.append being true in the Hadoop conf when DFS_SUPPORT_APPEND_DEFAULT is false. We should probably always check that dfs.durable.sync is absent or true when configured. Thoughts?
        Hide
        Christopher Tubbs added a comment -

        The constants in DFSConfigKeys aren't stable between versions.

        Show
        Christopher Tubbs added a comment - The constants in DFSConfigKeys aren't stable between versions.
        Hide
        Josh Elser added a comment -

        Currently trying to wrangle an "accurate" view on the Apache Hadoops. I don't think I'll be able to make an assessment on what I think we should do until I have a complete picture.

        I'm hoping that DFSConfigKeys will be a uniform place I can look for the defaults (or it will at least match what the code might do implicitly, which I found one instance of already).

        Show
        Josh Elser added a comment - Currently trying to wrangle an "accurate" view on the Apache Hadoops. I don't think I'll be able to make an assessment on what I think we should do until I have a complete picture. I'm hoping that DFSConfigKeys will be a uniform place I can look for the defaults (or it will at least match what the code might do implicitly, which I found one instance of already).
        Hide
        Eric Newton added a comment -

        We could just check for dfs.durable.sync first.

        Show
        Eric Newton added a comment - We could just check for dfs.durable.sync first.
        Hide
        John Vines added a comment -

        One of the releases since 1.04 added a centralized location for default
        properties that you may be able to use.

        Sent from my phone, please pardon the typos and brevity.

        Show
        John Vines added a comment - One of the releases since 1.04 added a centralized location for default properties that you may be able to use. Sent from my phone, please pardon the typos and brevity.

          People

          • Assignee:
            Josh Elser
            Reporter:
            Josh Elser
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development