Flume / FLUME-1618

Make Flume NG build and tests work with Hadoop 2.0 & Hbase 0.96

    Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Build, Test
    • Labels:

      Description

      Add hadoop 2.0 support for Flume NG

      1. FLUME-1618-4.patch
        12 kB
        Hari Shreedharan
      2. FLUME-1618.v3.patch
        11 kB
        Roshan Naik
      3. FLUME-1618.v2.patch
        11 kB
        Roshan Naik
      4. FLUME-1618.patch
        12 kB
        Roshan Naik

        Issue Links

          Activity

          Hari Shreedharan added a comment -

          Yes, I will update the patch with this info added.

          Roshan Naik added a comment -

          Wondering if it would be a good idea to add a note about this in the dev guide.

          Roshan Naik added a comment -

          Hari... please go ahead.

          Hari Shreedharan made changes -
          Attachment FLUME-1618-4.patch [ 12622355 ]
          Hari Shreedharan made changes -
          Attachment FLUME-1618-4.patch [ 12622353 ]
          Hari Shreedharan made changes -
          Attachment FLUME-1618-4.patch [ 12622353 ]
          Hari Shreedharan added a comment -

          Minor fix to avoid a Hadoop test failure in the new profile - fix based on HBASE-8842

          Hari Shreedharan made changes -
          Attachment FLUME-1618-4.patch [ 12622348 ]
          Hari Shreedharan made changes -
          Attachment FLUME-1618-4.patch [ 12622348 ]
          Hari Shreedharan added a comment -

          Patch that allows Flume to build against HBase 0.96.

          Hari Shreedharan added a comment -

          Roshan Naik - Please let me know if you are still working on this. If not, I will submit the patch.

          Hari Shreedharan added a comment -

          I have a patch for this issue, which handles all of the current requirements. I will submit it soon.

          Roshan Naik made changes -
          Summary: "Make Flume NG build and tests work with Hadoop 2.0" changed to "Make Flume NG build and tests work with Hadoop 2.0 & Hbase 0.96"
          Hari Shreedharan added a comment -

          Roshan Naik - Looks like most of the upstream issues regarding this have been fixed. Do you think you can update the patch to add hbase-95 support now? We should perhaps have 2 separate profiles for hbase too - looks like my suggestion above is not really the best if someone wants to do hadoop 1 + hbase-95 too.

          Hari Shreedharan added a comment -

          Another alternative is to create a new profile - hadoop2+hbase-95. We must make sure that only one of hadoop-1, hadoop-2 and hadoop2-hbase-95 is active at any point. Right now, specifying no hadoop.profile (or Phadoop*) activates hadoop-1. We will need to ensure that if hadoop2-hbase-95 is activated, then hadoop-1 is not - perhaps by using !(hadoop.profile || hadoop2-hbase-95) - whatever the syntax is.

          Hari Shreedharan added a comment -

          Hi Roshan,

          Some comments:

          • Looks like you have merged the hadoop-2 profile with the hbase-95 one. Let's do this - create an hbase-95 profile, set a property in the hadoop profiles as a suffix to the hbase artifact name (suffix as hadoop-1 for hadoop-1 and hadoop-2 for hadoop-2). In the hbase-95 profile, set the version of hbase to be ${hbase.artifact}95${hadoop-suffix}.

          • For now we can leave the Asynchbase tests disabled in the hbase-95 profile.
          • The setWriteToWal to setDurability change needs to be done using reflection, else builds against hbase-94 will fail.
          • There is one additional change in the Syslog Source - that is not relevant to this jira. There is also a guava upgrade, let's do that in a different jira (I am +1 on upgrading Guava - just in a different jira)
          • Why make this change - the Apache staging repo is already in the pom:
            +    <repoid>apache.staging.https</repoid>
            +    <repourl>https://repository.apache.org/service/local/staging/deploy/maven2/</repourl>
            +    <reponame>Apache Staging Repository</reponame>
            +    <repo.maven.org>http://repo1.maven.org/maven2</repo.maven.org>
            +    <publicrepoid>public</publicrepoid>
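
          The reflection approach mentioned in the setWriteToWal bullet above could be sketched roughly as follows. This is a minimal, self-contained illustration, not the code from any of the attached patches: OldStylePut, NewStylePut and the nested Durability enum are hypothetical stand-ins for the HBase 0.94 and 0.95+ client classes, and disableWal is an invented helper name. It also assumes the newer API takes an enum that has a constant named SKIP_WAL.

```java
import java.lang.reflect.Method;

public class WalReflectionSketch {

    // Hypothetical stand-in for an HBase 0.94-style Put (boolean WAL switch).
    public static class OldStylePut {
        public boolean writeToWal = true;
        public void setWriteToWAL(boolean write) { this.writeToWal = write; }
    }

    // Hypothetical stand-in for the newer enum-based durability setting.
    public enum Durability { SKIP_WAL, SYNC_WAL }

    // Hypothetical stand-in for an HBase 0.95+-style Put.
    public static class NewStylePut {
        public Durability durability = Durability.SYNC_WAL;
        public void setDurability(Durability d) { this.durability = d; }
    }

    /**
     * Turns off WAL writes using whichever setter the object exposes.
     * Nothing here refers to the newer types at compile time, so the same
     * source would build against either API. Returns the method name used.
     */
    public static String disableWal(Object put) {
        try {
            // Prefer the newer setter: one enum parameter, constant SKIP_WAL.
            for (Method m : put.getClass().getMethods()) {
                if (m.getName().equals("setDurability")
                        && m.getParameterCount() == 1
                        && m.getParameterTypes()[0].isEnum()) {
                    for (Object constant : m.getParameterTypes()[0].getEnumConstants()) {
                        if (((Enum<?>) constant).name().equals("SKIP_WAL")) {
                            m.invoke(put, constant);
                            return m.getName();
                        }
                    }
                }
            }
            // Fall back to the older boolean setter.
            Method legacy = put.getClass().getMethod("setWriteToWAL", boolean.class);
            legacy.invoke(put, false);
            return legacy.getName();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No known WAL setter on " + put.getClass(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(disableWal(new OldStylePut()));
        System.out.println(disableWal(new NewStylePut()));
    }
}
```

          In a real sink the lookup would probably be done once and the Method cached, since reflective lookups on every event would add avoidable overhead.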
            
          Mike Percy made changes -
          Fix Version/s v1.4.0 [ 12323372 ]
          Roshan Naik added a comment -

          indeed.

          Hari Shreedharan added a comment -

          Thanks Roshan. Makes sense. Can you please keep an eye out for the hadoop2 artifacts from HBase? We can come back to this when ready.

          Roshan Naik added a comment -

          I have posted this patch here so that it's easy to get back to it. I think we should wait until HBase actually publishes those hadoop2 binaries before considering this patch for commit.

          Roshan Naik made changes -
          Attachment FLUME-1618.v3.patch [ 12583412 ]
          Roshan Naik added a comment -

          Incorporating Hari's feedback to re-enable the hbase async test

          Hari Shreedharan added a comment -

          At this point, we should not be disabling the async hbase sink tests (especially considering that this sink performs much better) or moving the minimum version to 0.95 - I think most installs of HBase are going to be <=0.94.x. Perhaps that can be done in the future.

          Roshan Naik made changes -
          Attachment FLUME-1618.v2.patch [ 12583063 ]
          Roshan Naik made changes -
          Attachment FLUME-1618.v2.patch [ 12583026 ]
          Roshan Naik made changes -
          Attachment FLUME-1618.v2.patch [ 12583026 ]
          Roshan Naik added a comment -

          patch v2:

          • Using HBase 0.95 for the Hadoop2 profile. HBase dependencies have now been moved into the hadoop1 & hadoop2 profiles.
          • Bumping up the version of Guava to 12.0.1, as HBase has moved forward and depends on it. This needs some minor tweaks in the Flume source code to accommodate deprecations that have taken effect in Guava.
          • The Async HBase client no longer works due to incompatibilities introduced in HBase 0.95, so the AsyncHBase tests are disabled for now.
          • HBase has not yet published the final Hadoop 2 artifacts. For Flume, we can build HBase locally and point the build at the local build using -Dhbase.version=0.95.0-SNAPSHOT
          Hari Shreedharan added a comment -

          Yes. Hadoop does publish maven artifacts for hadoop-2.0.x releases - so that should be pulled in automatically. You can try it out yourself (run mvn clean install -Dhadoop.profile=2) - just make sure you take care of the local hbase build.

          Roshan Naik added a comment -

          The only other component with a hadoop 2 dependency that I can think of is the HDFS sink. Is it known to work fine with hadoop 2?

          Is enabling the hadoop-2 profile all that is needed to build Flume with hadoop2 (other than the special local hbase build)?

          Hari Shreedharan added a comment -

          Well, you will still need a local build of HBase with the hadoop 2 profile.

          Roshan Naik added a comment -

          Has this issue been addressed by FLUME-1653 & FLUME-1651?

          Brock Noland made changes -
          Fix Version/s v1.4.0 [ 12323372 ]
          Fix Version/s v1.3.0 [ 12322140 ]
          Roshan Naik made changes -
          Attachment FLUME-1618.patch [ 12548506 ]
          Roshan Naik added a comment -

          Hbase version may need to be updated once hbase binaries for hadoop2 are published (it appears likely 0.94.2 will have two separate binaries for hadoop2)

          Roshan Naik made changes -
          Attachment FLUME-1618.draft.patch [ 12547811 ]
          Roshan Naik made changes -
          Assignee: Bruno Mahé [ bmahe ] changed to Roshan Naik [ roshan_naik ]
          Hari Shreedharan added a comment -

          I had filed one long ago: https://issues.apache.org/jira/browse/HBASE-6020 - nothing on it yet.

          Roshan Naik made changes -
          Link This issue is blocked by HBASE-6929 [ HBASE-6929 ]
          Roshan Naik added a comment -

          The HBase version (for hadoop2) is not really published anywhere as yet. I built HBase locally using the hadoop2.0 profile to validate this flume patch that I am working on. After discussing this issue with someone on the HBase side, there is now a jira filed against HBase to get the hadoop2 version of HBase published (https://issues.apache.org/jira/browse/HBASE-6929).

          Hari Shreedharan added a comment -

          I have added you as a contributor to the project. If you are taking this over from Bruno, you can now reassign it to yourself.

          Hari Shreedharan added a comment -

          Roshan - Thanks for the patch. Is this hbase version on maven central? Or is it custom compiled?

          Roshan Naik made changes -
          Attachment FLUME-1618.draft.patch [ 12547811 ]
          Roshan Naik added a comment -

          Bruno, I have this draft patch and am working on some verifications. If you don't already have something ready, could you assign this to me? This fix also depends on HBase compiled specifically against hadoop2.

          Roshan Naik added a comment -

          I have been looking into this lately. Let me know if someone else is already working on it.

          Roshan Naik made changes -
          Fix Version/s v1.3.0 [ 12322140 ]
          Fix Version/s v1.0.0 [ 12318896 ]
          Roshan Naik made changes -
          Component/s Build [ 12315318 ]
          Component/s Test [ 12315319 ]
          Roshan Naik made changes -
          Affects Version/s NG alpha 1 [ 12318440 ]
          Roshan Naik made changes -
          Description: "This task is about adding an experimental profile for Hadoop 0.23" changed to "Add hadoop 2.0 support for Flume NG"
          Roshan Naik made changes -
          Field Original Value New Value
          Link This issue is a clone of FLUME-901 [ FLUME-901 ]
          Roshan Naik created issue -

            People

            • Assignee:
              Roshan Naik
              Reporter:
              Roshan Naik
            • Votes:
              0
              Watchers:
              3
