Flume
  1. Flume
  2. FLUME-937

Make Flume compile against Hadoop 0.23

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: v0.9.5
    • Fix Version/s: v0.9.5
    • Component/s: Build, Sinks+Sources
    • Labels:
      None
    1. Flume-937.patch
      19 kB
      Prasad Mujumdar

      Activity

      Prasad Mujumdar created issue -
      Hide
      jiraposter@reviews.apache.org added a comment -

      -----------------------------------------------------------
      This is an automatically generated e-mail. To reply, visit:
      https://reviews.apache.org/r/3599/
      -----------------------------------------------------------

      Review request for Bruno Mahé and jmhsieh.

      Summary
      -------

      1. Add new maven profile to compile with Hadoop 0.23
      2. Exclude the variants of sequence file which are not compiling with 0.23
      3. Replace the deprecated HDFS interface (getCompressionType)

      This addresses bug Flume-937.
      https://issues.apache.org/jira/browse/Flume-937

      Diffs


      flume-core/pom.xml 4a09812
      flume-core/src/main/java/com/cloudera/flume/handlers/hdfs/SeqfileEventSink.java b88c292
      flume-core/src/main/java/com/cloudera/flume/handlers/seqfile/SequenceFileOutputFormat.java e6d321f
      flume-core/src/test/java/com/cloudera/flume/agent/durability/TestNaiveFileWALManager.java ce42c37
      pom.xml 1ac9aaa

      Diff: https://reviews.apache.org/r/3599/diff

      Testing
      -------

      Full test run.
      Still have two failures, one is intermittent and other looks like hadoop bug. will complete the analysis.

      Thanks,

      Prasad

      Show
      jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3599/ ----------------------------------------------------------- Review request for Bruno Mahé and jmhsieh. Summary ------- 1. Add new maven profile to compile with Hadoop 0.23 2. Exclude the variants of sequence file which are not compiling with 0.23 3. Replace the deprecated HDFS interface (getCompressionType) This addresses bug Flume-937. https://issues.apache.org/jira/browse/Flume-937 Diffs flume-core/pom.xml 4a09812 flume-core/src/main/java/com/cloudera/flume/handlers/hdfs/SeqfileEventSink.java b88c292 flume-core/src/main/java/com/cloudera/flume/handlers/seqfile/SequenceFileOutputFormat.java e6d321f flume-core/src/test/java/com/cloudera/flume/agent/durability/TestNaiveFileWALManager.java ce42c37 pom.xml 1ac9aaa Diff: https://reviews.apache.org/r/3599/diff Testing ------- Full test run. Still have two failures, one is intermittent and other looks like hadoop bug. will complete the analysis. Thanks, Prasad
      Hide
      Tom White added a comment -

      How did SequenceFile change between 0.20 and 0.23? If possible, it would be preferable to make the changes in Hadoop if there are compatibility issues that need fixing.

      Also, you should depend on Apache Hadoop versions.

      Show
      Tom White added a comment - How did SequenceFile change between 0.20 and 0.23? If possible, it would be preferable to make the changes in Hadoop if there are compatibility issues that need fixing. Also, you should depend on Apache Hadoop versions.
      Hide
      Prasad Mujumdar added a comment -

      @Tom, thanks for taking a look.

      Flume 0.9x has a couple of wrapper classes on top of SequenceFile, mainly to simulate flush semantics in code that was originally developed with hadoop 0.18. These classes have code directly copied from hadoop SequenceFile.java that calls internal private methods which are not supported in 0.23 anymore. The patch has removed those files from the build and replaced the calls with direct SequenceFile methods.

      I will make change the versions to Apache hadoop.

      Show
      Prasad Mujumdar added a comment - @Tom, thanks for taking a look. Flume 0.9x has a couple of wrapper classes on top of SequenceFile, mainly to simulate flush semantics in code that was originally developed with hadoop 0.18. These classes have code directly copied from hadoop SequenceFile.java that calls internal private methods which are not supported in 0.23 anymore. The patch has removed those files from the build and replaced the calls with direct SequenceFile methods. I will make change the versions to Apache hadoop.
      Hide
      Tom White added a comment -

      Thanks for the explanation, Prasad. Are the SequenceFile variants needed in NG? I assume not, in which case it's probably OK to keep them in Flume in the 0.9 branch.

      BTW for the Maven profile you might have a look at other Maven-based projects (MAHOUT-822, HBase) which support 20 and 23 so that the flags are consistent at least.

      Show
      Tom White added a comment - Thanks for the explanation, Prasad. Are the SequenceFile variants needed in NG? I assume not, in which case it's probably OK to keep them in Flume in the 0.9 branch. BTW for the Maven profile you might have a look at other Maven-based projects ( MAHOUT-822 , HBase) which support 20 and 23 so that the flags are consistent at least.
      Hide
      Prasad Mujumdar added a comment -

      Correct, it's specific to 0.9 branch. The NG doesn't use that.

      Sure, will try to keep the flags consistent.

      Show
      Prasad Mujumdar added a comment - Correct, it's specific to 0.9 branch. The NG doesn't use that. Sure, will try to keep the flags consistent.
      Hide
      jiraposter@reviews.apache.org added a comment -

      -----------------------------------------------------------
      This is an automatically generated e-mail. To reply, visit:
      https://reviews.apache.org/r/3599/#review4618
      -----------------------------------------------------------

      Ship it!

      Hey Prasad, mostly nits / comment requests for more 'whys' in code. Please add and feel free tocommit.

      Which tests failed? If require significant changes, please resubmit. If they are minor, this essentially lgtm.

      flume-core/pom.xml
      <https://reviews.apache.org/r/3599/#comment10276>

      Is here a reason why there is only one profile here but 2 in the other?

      as a non-maven expert, is any way to consolidate the two (i thought the subprojects can inherit properties.)

      flume-core/pom.xml
      <https://reviews.apache.org/r/3599/#comment10273>

      nit: indents

      flume-core/src/main/java/com/cloudera/flume/handlers/hdfs/SeqfileEventSink.java
      <https://reviews.apache.org/r/3599/#comment10277>

      This is wonky code – I can understand why it is needed with the context of the JIRA but without it I can't tell. Can you add comments about why this is necessary and a reference to FLUME-937? This seems to be the right place to explain it.

      flume-core/src/main/java/com/cloudera/flume/handlers/hdfs/SeqfileEventSink.java
      <https://reviews.apache.org/r/3599/#comment10278>

      nit: extra lines.

      pom.xml
      <https://reviews.apache.org/r/3599/#comment10274>

      File an issue to make compile against apache 1.0.0 hadoop?

      pom.xml
      <https://reviews.apache.org/r/3599/#comment10275>

      File an issue to make it compile against apache hadoop 0.23.0?

      pom.xml
      <https://reviews.apache.org/r/3599/#comment10272>

      nit: indents

      • jmhsieh

      On 2012-01-24 07:12:54, Prasad Mujumdar wrote:

      -----------------------------------------------------------

      This is an automatically generated e-mail. To reply, visit:

      https://reviews.apache.org/r/3599/

      -----------------------------------------------------------

      (Updated 2012-01-24 07:12:54)

      Review request for Bruno Mahé and jmhsieh.

      Summary

      -------

      1. Add new maven profile to compile with Hadoop 0.23

      2. Exclude the variants of sequence file which are not compiling with 0.23

      3. Replace the deprecated HDFS interface (getCompressionType)

      This addresses bug Flume-937.

      https://issues.apache.org/jira/browse/Flume-937

      Diffs

      -----

      flume-core/pom.xml 4a09812

      flume-core/src/main/java/com/cloudera/flume/handlers/hdfs/SeqfileEventSink.java b88c292

      flume-core/src/main/java/com/cloudera/flume/handlers/seqfile/SequenceFileOutputFormat.java e6d321f

      flume-core/src/test/java/com/cloudera/flume/agent/durability/TestNaiveFileWALManager.java ce42c37

      pom.xml 1ac9aaa

      Diff: https://reviews.apache.org/r/3599/diff

      Testing

      -------

      Full test run.

      Still have two failures, one is intermittent and other looks like hadoop bug. will complete the analysis.

      Thanks,

      Prasad

      Show
      jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3599/#review4618 ----------------------------------------------------------- Ship it! Hey Prasad, mostly nits / comment requests for more 'whys' in code. Please add and feel free tocommit. Which tests failed? If require significant changes, please resubmit. If they are minor, this essentially lgtm. flume-core/pom.xml < https://reviews.apache.org/r/3599/#comment10276 > Is here a reason why there is only one profile here but 2 in the other? as a non-maven expert, is any way to consolidate the two (i thought the subprojects can inherit properties.) flume-core/pom.xml < https://reviews.apache.org/r/3599/#comment10273 > nit: indents flume-core/src/main/java/com/cloudera/flume/handlers/hdfs/SeqfileEventSink.java < https://reviews.apache.org/r/3599/#comment10277 > This is wonky code – I can understand why it is needed with the context of the JIRA but without it I can't tell. Can you add comments about why this is necessary and a reference to FLUME-937 ? This seems to be the right place to explain it. flume-core/src/main/java/com/cloudera/flume/handlers/hdfs/SeqfileEventSink.java < https://reviews.apache.org/r/3599/#comment10278 > nit: extra lines. pom.xml < https://reviews.apache.org/r/3599/#comment10274 > File an issue to make compile against apache 1.0.0 hadoop? pom.xml < https://reviews.apache.org/r/3599/#comment10275 > File an issue to make it compile against apache hadoop 0.23.0? pom.xml < https://reviews.apache.org/r/3599/#comment10272 > nit: indents jmhsieh On 2012-01-24 07:12:54, Prasad Mujumdar wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3599/ ----------------------------------------------------------- (Updated 2012-01-24 07:12:54) Review request for Bruno Mahé and jmhsieh. Summary ------- 1. Add new maven profile to compile with Hadoop 0.23 2. Exclude the variants of sequence file which are not compiling with 0.23 3. Replace the deprecated HDFS interface (getCompressionType) This addresses bug Flume-937. https://issues.apache.org/jira/browse/Flume-937 Diffs ----- flume-core/pom.xml 4a09812 flume-core/src/main/java/com/cloudera/flume/handlers/hdfs/SeqfileEventSink.java b88c292 flume-core/src/main/java/com/cloudera/flume/handlers/seqfile/SequenceFileOutputFormat.java e6d321f flume-core/src/test/java/com/cloudera/flume/agent/durability/TestNaiveFileWALManager.java ce42c37 pom.xml 1ac9aaa Diff: https://reviews.apache.org/r/3599/diff Testing ------- Full test run. Still have two failures, one is intermittent and other looks like hadoop bug. will complete the analysis. Thanks, Prasad
      Hide
      jiraposter@reviews.apache.org added a comment -

      On 2012-01-26 01:33:14, jmhsieh wrote:

      > Hey Prasad, mostly nits / comment requests for more 'whys' in code. Please add and feel free tocommit.

      >

      > Which tests failed? If require significant changes, please resubmit. If they are minor, this essentially lgtm.

      Thanks for the review Jon !

      Addressed the space/indentation nits
      Updated the pom to pickup the hadoop dependencies from apache repo
      added comments in SeqfileEventSink.java

      The failed test seems to be a 0.23 bug, I will go ahead and log a jira and also one for hadoop 1.0 compilation.

      • Prasad

      -----------------------------------------------------------
      This is an automatically generated e-mail. To reply, visit:
      https://reviews.apache.org/r/3599/#review4618
      -----------------------------------------------------------

      On 2012-01-24 07:12:54, Prasad Mujumdar wrote:

      -----------------------------------------------------------

      This is an automatically generated e-mail. To reply, visit:

      https://reviews.apache.org/r/3599/

      -----------------------------------------------------------

      (Updated 2012-01-24 07:12:54)

      Review request for Bruno Mahé and jmhsieh.

      Summary

      -------

      1. Add new maven profile to compile with Hadoop 0.23

      2. Exclude the variants of sequence file which are not compiling with 0.23

      3. Replace the deprecated HDFS interface (getCompressionType)

      This addresses bug Flume-937.

      https://issues.apache.org/jira/browse/Flume-937

      Diffs

      -----

      flume-core/pom.xml 4a09812

      flume-core/src/main/java/com/cloudera/flume/handlers/hdfs/SeqfileEventSink.java b88c292

      flume-core/src/main/java/com/cloudera/flume/handlers/seqfile/SequenceFileOutputFormat.java e6d321f

      flume-core/src/test/java/com/cloudera/flume/agent/durability/TestNaiveFileWALManager.java ce42c37

      pom.xml 1ac9aaa

      Diff: https://reviews.apache.org/r/3599/diff

      Testing

      -------

      Full test run.

      Still have two failures, one is intermittent and other looks like hadoop bug. will complete the analysis.

      Thanks,

      Prasad

      Show
      jiraposter@reviews.apache.org added a comment - On 2012-01-26 01:33:14, jmhsieh wrote: > Hey Prasad, mostly nits / comment requests for more 'whys' in code. Please add and feel free tocommit. > > Which tests failed? If require significant changes, please resubmit. If they are minor, this essentially lgtm. Thanks for the review Jon ! Addressed the space/indentation nits Updated the pom to pickup the hadoop dependencies from apache repo added comments in SeqfileEventSink.java The failed test seems to be a 0.23 bug, I will go ahead and log a jira and also one for hadoop 1.0 compilation. Prasad ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3599/#review4618 ----------------------------------------------------------- On 2012-01-24 07:12:54, Prasad Mujumdar wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3599/ ----------------------------------------------------------- (Updated 2012-01-24 07:12:54) Review request for Bruno Mahé and jmhsieh. Summary ------- 1. Add new maven profile to compile with Hadoop 0.23 2. Exclude the variants of sequence file which are not compiling with 0.23 3. Replace the deprecated HDFS interface (getCompressionType) This addresses bug Flume-937. https://issues.apache.org/jira/browse/Flume-937 Diffs ----- flume-core/pom.xml 4a09812 flume-core/src/main/java/com/cloudera/flume/handlers/hdfs/SeqfileEventSink.java b88c292 flume-core/src/main/java/com/cloudera/flume/handlers/seqfile/SequenceFileOutputFormat.java e6d321f flume-core/src/test/java/com/cloudera/flume/agent/durability/TestNaiveFileWALManager.java ce42c37 pom.xml 1ac9aaa Diff: https://reviews.apache.org/r/3599/diff Testing ------- Full test run. Still have two failures, one is intermittent and other looks like hadoop bug. will complete the analysis. Thanks, Prasad
      Prasad Mujumdar made changes -
      Field Original Value New Value
      Attachment Flume-937.patch [ 12512617 ]
      Hide
      Prasad Mujumdar added a comment -

      Patch attached

      Show
      Prasad Mujumdar added a comment - Patch attached
      Prasad Mujumdar made changes -
      Status Open [ 1 ] Patch Available [ 10002 ]
      Hide
      Prasad Mujumdar added a comment -

      Patch committed to trunk.
      There's a new maven profile hadoop-0.23 that needs to be activated when compiling with hadoop 0.23.

      Show
      Prasad Mujumdar added a comment - Patch committed to trunk. There's a new maven profile hadoop-0.23 that needs to be activated when compiling with hadoop 0.23.
      Prasad Mujumdar made changes -
      Status Patch Available [ 10002 ] Resolved [ 5 ]
      Resolution Fixed [ 1 ]

        People

        • Assignee:
          Prasad Mujumdar
          Reporter:
          Prasad Mujumdar
        • Votes:
          0 Vote for this issue
          Watchers:
          1 Start watching this issue

          Dates

          • Created:
            Updated:
            Resolved:

            Development