HBase / HBASE-6962

Upgrade hadoop 1 dependency to hadoop 1.1

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.95.0
    • Component/s: None
    • Labels: None
    • Environment:

      hadoop 1.1 contains multiple important fixes, including HDFS-3703

    • Release Note:
      Upgrade hadoop 1 dependency to hadoop 1.1.0
      Attachments

    1. 6962.txt (0.5 kB) - Ted Yu

      Activity

      stack added a comment -

      Marking closed.

      Lars Hofhansl added a comment -

      Comment crossing

      Enis Soztutar added a comment -

      HBase does not use append, but needed append to be enabled in order to have the sync API (hflush, hsync). From what I understand, HADOOP-8230 decouples append and sync support, and enables sync by default.

      However, HBase still uses hflush() by default, and that has not changed. In HBASE-5954, we will have the option to choose between hflush/hsync depending on the desired durability guarantees per column family/table.

      See Lars' excellent blog post on this:
      http://hadoop-hbase.blogspot.com/2012/05/hbase-hdfs-and-durable-sync.html

      Lars Hofhansl added a comment -

      While 0.94 can be built against Hadoop 1.1.0, the behavior will be the same.
      There is a lot of confusion about append, sync, hsync, and hflush. Let me try to clarify.

      1. HBase never needed append, only the sync part of the 0.20-append branch.
      2. Until HDFS-744, Hadoop did not have any durable sync; hsync was identical to hflush.
      3. When we talk about sync in HBase we always mean hflush (until HBASE-5954 is done, that is).

      That means as far as this issue is concerned, you can safely switch to Hadoop 1.1.0.

      (I tried to summarize this here: http://hadoop-hbase.blogspot.com/2012/05/hbase-hdfs-and-durable-sync.html)
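      The hflush/hsync distinction described above can be illustrated outside HDFS with plain java.io (a sketch only, not HBase code; `SyncDemo` and the file name are invented for this example): `flush()` pushes buffered bytes to the OS page cache, roughly like hflush, while `FileDescriptor.sync()` forces them to the storage device, roughly like hsync.

      ```java
      import java.io.FileOutputStream;
      import java.io.IOException;
      import java.nio.charset.StandardCharsets;

      public class SyncDemo {
          public static void main(String[] args) throws IOException {
              try (FileOutputStream out = new FileOutputStream("wal-demo.log")) {
                  out.write("edit-1\n".getBytes(StandardCharsets.UTF_8));
                  // "hflush" analogue: data leaves the JVM and reaches the OS
                  // page cache; readers can see it, but a power failure may lose it.
                  out.flush();

                  out.write("edit-2\n".getBytes(StandardCharsets.UTF_8));
                  // "hsync" analogue: fsync under the hood; data is forced to
                  // the device, surviving a crash at the cost of extra latency.
                  out.getFD().sync();
              }
          }
      }
      ```

      The trade-off Varun asks about below is visible here: the sync() call is the expensive one, which is why HBase defaulting to the hflush-style call matters for performance.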

      Varun Sharma added a comment -

      Hey folks,

      I saw the hadoop 1.1.0 release notes and wanted to check on HADOOP-8230 in particular. I am looking to try 1.1 with hbase 0.94 so as to get the hdfs stale-node patches into the system. However, I wanted to check on the sync/append changes. Does this change mean that hsync is enabled by default and is going to be used for the HBase append operation instead of the previous hflush implementation? From my understanding, this would be a performance cost (persisting to disk vs writing to OS buffers)?

      Thanks
      Varun

      Enis Soztutar added a comment (edited) -

      You say, 'Although that Hadoop section needs a cleanup in general.' What would you suggest?

      We can create a matrix of hadoop version x hbase version to list what has been tested, and what is not actively supported. Something like:

      HBase \ Hadoop | 0.20-append | 1.0.x | 1.1.x | 0.23 | 2.0 |
      0.92           |  S          |   S   |  S    |  NT  | NT  |
      0.94           |  S          |   S   |  S    |  NT  | NT  |
      0.96           |  NS         |   S   |  S    |  S   |  S  |
      

      Where S = supported, NS = not supported, NT = not tested. The actual values above are from the top of my head, I might be wrong.

      @Stack agreed. Ted, could you please raise this issue on the dev list. We can revert this if we decide to run tests with hadoop-1.0, although the reason we may want to go with this is that both 1.1 and 1.0 are labeled as stable, and an upgrade to 1.1 is the recommended path (see Matt's mail thread).

      @Lars, 1.0.4 fixes a security issue, I think we should update in 0.94. No need to add a target, since -Dhadoop.version should suffice.
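      The `-Dhadoop.version` override mentioned above would look something like the following (a sketch, assuming the HBase pom exposes a `hadoop.version` property as the comment implies):

      ```shell
      # Build HBase against a specific Hadoop 1.x release without editing pom.xml;
      # the property overrides the default Hadoop dependency version.
      mvn clean install -DskipTests -Dhadoop.version=1.0.4
      ```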

      stack added a comment -

      Save it for posting to dev list I'd say Ted

      Ted Yu added a comment -

      For reference, here is the list of incompatible changes between hadoop 1.0 and 1.1:

      HDFS-2617. Replaced Kerberized SSL for image transfer and fsck with
      SPNEGO-based solution. (Jakob Homan, Owen O'Malley, Alejandro Abdelnur and
      Aaron T. Myers via atm)

      HDFS-3044. fsck move should be non-destructive by default.
      (Colin Patrick McCabe via eli)

      HADOOP-8230. Enable sync by default and disable append. (eli)

      HADOOP-8365. Provide ability to disable working sync. (eli)

      HADOOP-8552. Conflict: Same security.log.file for multiple users.
      (kkambatl via tucu)

      stack added a comment -

      @Enis You say, 'Although that Hadoop section needs a cleanup in general.' What would you suggest?

      @Lars We could upgrade yes.

      Anyone know of API differences between hadoop 1.1 and 1.0? Would it be better to ship w/ hadoop 1.0, since that is what we are saying is the minimum requirement for 0.96, and then recommend in the doc that folks run 1.1 because of the fixed issues and perf benefits?

      As has been suggested above, seems like this is something worthy of raising on dev list (even if this has gone in already, if only to raise consciousness).

      Lars Hofhansl added a comment -

      Should we update the default for 0.94 to 1.0.4 (and add a target for 1.1.0)?

      Hudson added a comment -

      Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #222 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/222/)
      HBASE-6962 Upgrade hadoop 1 dependency to hadoop 1.1 (Revision 1398580)

      Result = FAILURE
      enis :
      Files :

      • /hbase/trunk/pom.xml
      Hudson added a comment -

      Integrated in HBase-TRUNK #3449 (See https://builds.apache.org/job/HBase-TRUNK/3449/)
      HBASE-6962 Upgrade hadoop 1 dependency to hadoop 1.1 (Revision 1398580)

      Result = FAILURE
      enis :
      Files :

      • /hbase/trunk/pom.xml
      Ted Yu added a comment -

      The thinking behind the new recommendation is that future patches for MTTR may depend on improvements unique to hadoop 1.1 (absent from hadoop 1.0.4).

      Enis Soztutar added a comment -

      From general@, it seems that both 1.0.4 and 1.1.0 are stable releases, but 1.1.0 includes a lot of performance improvements and such. See http://search-hadoop.com/m/0eTo41c8GSb.
      Not sure why we would want to recommend 1.1 vs 1.0.x.
      At http://hbase.apache.org/book.html#basic.prerequisites, we state that 0.96 requires at least 1.0, but there is no requirement for 1.1 yet. Although that Hadoop section needs a cleanup in general.

      Ted Yu added a comment -

      @Enis:
      Do you think we should start a thread on dev@hbase asking whether hadoop 1.1 should be the recommended hadoop release for hbase 0.96?

      Enis Soztutar added a comment -

      +1. I've committed this. Thanks Ted.

      Ted Yu added a comment -

      I ran TestMultiParallel and it passed:

      Running org.apache.hadoop.hbase.client.TestMultiParallel
      2012-10-15 10:12:46.310 java[45413:1903] Unable to load realm mapping info from SCDynamicStore
      Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 62.623 sec
      

      TestZooKeeperTableArchiveClient is covered in HBASE-6707
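      A single-class run like the one above is typically done through surefire's test selector (a sketch; the `-pl hbase-server` module argument is an assumption for the modularized trunk layout and may differ by branch):

      ```shell
      # Run only the named unit test class via the surefire -Dtest selector.
      mvn test -Dtest=TestMultiParallel -pl hbase-server
      ```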

      Hadoop QA added a comment -

      -1 overall. Here are the results of testing the latest attachment
      http://issues.apache.org/jira/secure/attachment/12549105/6962.txt
      against trunk revision .

      +1 @author. The patch does not contain any @author tags.

      -1 tests included. The patch doesn't appear to include any new or modified tests.
      Please justify why no new tests are needed for this patch.
      Also please list what manual steps were performed to verify this patch.

      +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

      -1 javadoc. The javadoc tool appears to have generated 82 warning messages.

      +1 javac. The applied patch does not increase the total number of javac compiler warnings.

      -1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings.

      +1 release audit. The applied patch does not increase the total number of release audit warnings.

      -1 core tests. The patch failed these unit tests:
      org.apache.hadoop.hbase.client.TestMultiParallel
      org.apache.hadoop.hbase.backup.example.TestZooKeeperTableArchiveClient

      Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3051//testReport/
      Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3051//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
      Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3051//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
      Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3051//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
      Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3051//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
      Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3051//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
      Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3051//console

      This message is automatically generated.

      Nicolas Liochon added a comment -

      fwiw, I've run the unit tests on hbase trunk + hdfs branch 1.1 + HDFS-3912, no errors.


        People

        • Assignee: Ted Yu
        • Reporter: Ted Yu
        • Votes: 0
        • Watchers: 9
