FLUME-1020

Implement Kerberos security for HDFS Sink

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: v1.0.0
    • Fix Version/s: v1.2.0
    • Component/s: Sinks+Sources
    • Labels:
      None

      Description

      Make the Flume HDFS sink work with secure clusters.

        Issue Links

          Activity

          Arvind Prabhakar created issue -
          Mike Percy made changes -
          Field | Original Value | New Value
          Assignee | (empty) | Mike Percy [ mpercy ]
          Arvind Prabhakar made changes -
          Field | Original Value | New Value
          Summary | Implement Kerberose security for HDFS Sink | Implement Kerberos security for HDFS Sink
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/4360/
          -----------------------------------------------------------

          Review request for Flume.

          Summary
          -------

          This is an initial pass at an implementation of HDFS security. I think it will probably work. I'm currently trying to get Kerberos to play nicely with the cluster on my VM, though, so I haven't successfully tested it yet. It still works when used on HDFS with security disabled.

          The only thing I don't like is that configure() throws a FlumeException when authentication fails. I'll trace up the call stack and see how bad that would be, but it seems likely to break something. Just logging the error is kind of a bummer as well, though: we need to ensure process() doesn't fill up the disk while spewing copious error messages into the logs. Maybe this is a use case for some kind of FatalException type.

          This addresses bug FLUME-1020.
          https://issues.apache.org/jira/browse/FLUME-1020

          Diffs


          flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java da82f7e

          Diff: https://reviews.apache.org/r/4360/diff

          Testing
          -------

          Thanks,

          Mike

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/4360/
          -----------------------------------------------------------

          (Updated 2012-03-17 02:30:51.378131)

          Review request for Flume.

          Changes
          -------

          Got this working. Please take a look.

          For now, I am not throwing if authentication fails.

          Also, after speaking with folks familiar with HDFS, it turns out that in order to communicate with a secure cluster one must have the Hadoop config directory on the classpath. This is because the UserGroupInformation class uses static variables to keep track of its settings and state. So, I am looking for Hadoop environment variables in bin/flume-ng.
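
A minimal sketch of the environment-variable lookup described above, not the actual bin/flume-ng code. It assumes the Hadoop configuration directory is advertised through HADOOP_CONF_DIR or lives under $HADOOP_HOME/conf; the function names and the FLUME_CLASSPATH variable are hypothetical and used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the Hadoop config directory if one can be
# located from the environment, otherwise print nothing.
find_hadoop_conf_dir() {
  if [ -n "${HADOOP_CONF_DIR:-}" ] && [ -d "${HADOOP_CONF_DIR}" ]; then
    printf '%s\n' "${HADOOP_CONF_DIR}"
  elif [ -n "${HADOOP_HOME:-}" ] && [ -d "${HADOOP_HOME}/conf" ]; then
    printf '%s\n' "${HADOOP_HOME}/conf"
  fi
}

# Hypothetical helper: prepend directory $1 to colon-separated classpath $2.
prepend_classpath() {
  dir=$1
  cp=$2
  if [ -z "$dir" ]; then
    printf '%s\n' "$cp"
  elif [ -z "$cp" ]; then
    printf '%s\n' "$dir"
  else
    printf '%s\n' "$dir:$cp"
  fi
}

# Put the config directory first so its *-site.xml files are visible
# to the JVM that runs the HDFS sink.
FLUME_CLASSPATH=$(prepend_classpath "$(find_hadoop_conf_dir)" "${FLUME_CLASSPATH:-}")
```

If neither variable points at an existing directory, the classpath is left unchanged, which matches the case where Flume runs against HDFS with security disabled.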

          Tested this against a Kerberized Hadoop cluster running in a VM (CentOS 6) using MIT Kerberos on my laptop.

          Diffs (updated)


          bin/flume-ng 0796a5b
          flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java da82f7e

          Diff: https://reviews.apache.org/r/4360/diff

          Testing
          -------

          Thanks,

          Mike

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/4360/
          -----------------------------------------------------------

          (Updated 2012-03-20 22:33:43.076249)

          Review request for Flume.

          Changes
          -------

          No Java code has changed.

          Updated the build environment and the runtime environment as follows:

          1. At runtime, if the hadoop binary can be found on the system, Flume interrogates it to get the CLASSPATH and JAVA_LIBRARY_PATH variables out of it using tricks kindly shared by Roman in Bigtop. This allows us to find the hadoop configuration files and the appropriate JARs for the system being accessed. At least, it's basically the state of the art for compatibility right now if you want to call it that.
          2. To allow the tricks above to work at runtime, the hadoop artifacts have been marked as optional in the POM, which means that they will not be included in the binary distribution. That's fine, because they are only needed if the HDFS Sink is used, and we jump through hoops to find those artifacts if they're on the system.
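
The runtime lookup in item 1 can be sketched roughly as follows. This is an illustration under assumptions, not the committed script: it assumes the installed launcher supports the `hadoop classpath` subcommand, it does not show how the library path is obtained, and the function name and FLUME_CLASSPATH variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: ask a hadoop launcher for its classpath.
# $1 is the launcher to run (defaults to "hadoop" found on PATH).
# Prints nothing if the launcher cannot be found or the call fails.
hadoop_classpath() {
  hadoop_bin=${1:-hadoop}
  if command -v "$hadoop_bin" >/dev/null 2>&1; then
    "$hadoop_bin" classpath 2>/dev/null || true
  fi
}

# Append whatever the launcher reported to Flume's own classpath.
HADOOP_CP=$(hadoop_classpath)
if [ -n "$HADOOP_CP" ]; then
  FLUME_CLASSPATH="${FLUME_CLASSPATH:+$FLUME_CLASSPATH:}$HADOOP_CP"
fi
```

Because the Hadoop JARs come from the launcher at runtime rather than from the Flume distribution, the same Flume build can pick up whichever Hadoop version is installed on the host.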

          As a result, I am able to compile Flume against the default hadoop version (0.20.205.0) and run against versions of Hadoop that I didn't explicitly build against, like 0.23.x, without a problem. This is a huge improvement over when I was working on this last week, where I was adding/fixing profiles to get anything to work at all, since it's well known that different hadoop versions generally refuse to talk to each other.

          I also refactored the flume-ng script to duplicate less code and be a bit friendlier, since I was doing surgery in there anyway.

          This is ready for review now.

          Diffs (updated)


          bin/flume-ng 0796a5b
          flume-ng-sinks/flume-hdfs-sink/pom.xml 1a35baf
          flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java da82f7e

          Diff: https://reviews.apache.org/r/4360/diff

          Testing
          -------

          Thanks,

          Mike

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/4360/#review6166
          -----------------------------------------------------------

          bin/flume-ng
          <https://reviews.apache.org/r/4360/#comment13279>

          Should we also try the bigtop autodetect script?

          http://svn.apache.org/repos/asf/incubator/bigtop/trunk/bigtop-packages/src/common/bigtop-utils/bigtop-detect-javahome

          This script: http://svn.apache.org/repos/asf/incubator/bigtop/trunk/bigtop-packages/src/common/hadoop/install_hadoop.sh

          calls the script as follows:

            # Autodetect JAVA_HOME if not defined
            if [ -e /usr/libexec/bigtop-detect-javahome ]; then
              . /usr/libexec/bigtop-detect-javahome
            elif [ -e /usr/lib/bigtop-utils/bigtop-detect-javahome ]; then
              . /usr/lib/bigtop-utils/bigtop-detect-javahome
            fi

          flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java
          <https://reviews.apache.org/r/4360/#comment13277>

          The only place this method is called we have previously checked to see if security is enabled. Why do this check in both places?

          flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java
          <https://reviews.apache.org/r/4360/#comment13278>

          This is good debug info. But if it fails here, we have already logged in. Should we be returning false?

          • Brock

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/4360/
          -----------------------------------------------------------

          (Updated 2012-03-22 08:06:39.589736)

          Review request for Flume.

          Changes
          -------

          Brock, thanks for all the feedback!

          I am now looking for the bigtop JAVA_HOME detection script and calling it if it's there.

          I've also incorporated more suggestions from Roman, including using slf4j 1.6.1, which Hadoop and ZooKeeper are using. I'm also excluding slf4j from the Hadoop classpath when it's injected into Flume's classpath, to avoid warnings in the log when it's an older version of Hadoop.
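
One way the slf4j exclusion could look is sketched below. It is an assumption about the approach, not the patch itself: the hypothetical strip_slf4j function simply drops any classpath entry whose path contains "slf4j".

```shell
#!/bin/sh
# Hypothetical helper: drop every entry whose path mentions "slf4j"
# from a colon-separated classpath and print the remaining entries.
# The body runs in a subshell so the IFS and globbing changes do not leak.
strip_slf4j() (
  result=""
  IFS=:
  set -f                                   # keep wildcard entries such as lib/* literal
  for entry in $1; do
    case "$entry" in
      ""|*slf4j*) ;;                       # skip empty and slf4j entries
      *) result="${result:+$result:}$entry" ;;
    esac
  done
  printf '%s\n' "$result"
)
```

For example, `strip_slf4j "$HADOOP_CP"` could be applied to the Hadoop classpath before it is merged into Flume's. A name-based filter like this cannot see inside wildcard entries such as `lib/*`, so it only helps when the JARs are listed individually.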

          I also incorporated the suggestion about not checking twice and added some debug messages to indicate overall success or failure.

          I tested this all on a Kerberos cluster and it seems to work well.

          Diffs (updated)


          bin/flume-ng 0796a5b
          flume-ng-core/src/main/java/org/apache/flume/tools/GetJavaProperty.java PRE-CREATION
          flume-ng-sinks/flume-hdfs-sink/pom.xml 1a35baf
          flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java da82f7e
          pom.xml 227bca8

          Diff: https://reviews.apache.org/r/4360/diff

          Testing
          -------

          Thanks,

          Mike

          Mike Percy made changes -
          Attachment | (empty) | FLUME-1020-9.patch [ 12519416 ]
          Mike Percy made changes -
          Status | Open [ 1 ] | Patch Available [ 10002 ]
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/4360/#review6218
          -----------------------------------------------------------

          Ship it!

          +1

          Thanks for the patch Mike. Please attach it to the Jira. Also, it will be great if you can file a follow-up jira to remove the configuration constants from the system into their own separate class.

          • Arvind

          Arvind Prabhakar added a comment -

          Patch committed. Thanks Mike!

          Arvind Prabhakar made changes -
          Status | Patch Available [ 10002 ] | Resolved [ 5 ]
          Fix Version/s | (empty) | v1.2.0 [ 12320243 ]
          Resolution | (empty) | Fixed [ 1 ]
          jiraposter@reviews.apache.org added a comment -

          On 2012-03-22 08:25:35, Arvind Prabhakar wrote:

          > +1
          >
          > Thanks for the patch Mike. Please attach it to the Jira. Also, it will be great if you can file a follow-up jira to remove the configuration constants from the system into their own separate class.

          Thanks for committing this Arvind! I've filed https://issues.apache.org/jira/browse/FLUME-1044 to track removal of the config constants.

          • Mike

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/4360/#review6218
          -----------------------------------------------------------

          Hudson added a comment -

          Integrated in flume-trunk #137 (See https://builds.apache.org/job/flume-trunk/137/)
          FLUME-1020. Implement Kerberos security for HDFS Sink.

          (Mike Percy via Arvind Prabhakar) (Revision 1303685)

          Result = SUCCESS
          arvind : http://svn.apache.org/viewvc/?view=rev&rev=1303685
          Files :

          • /incubator/flume/trunk/bin/flume-ng
          • /incubator/flume/trunk/flume-ng-core/src/main/java/org/apache/flume/tools
          • /incubator/flume/trunk/flume-ng-core/src/main/java/org/apache/flume/tools/GetJavaProperty.java
          • /incubator/flume/trunk/flume-ng-sinks/flume-hdfs-sink/pom.xml
          • /incubator/flume/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java
          • /incubator/flume/trunk/pom.xml
          Mike Percy made changes -
          Link | (empty) | This issue breaks FLUME-1046 [ FLUME-1046 ]

            People

            • Assignee: Mike Percy
            • Reporter: Arvind Prabhakar
            • Votes: 0
            • Watchers: 0
