Hadoop HDFS
HDFS-1150

Verify datanodes' identities to clients in secure clusters

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.22.0
    • Component/s: datanode
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Currently we use block access tokens to allow datanodes to verify clients' identities; however, we don't have a way for clients to verify the authenticity of the datanodes themselves.

      Attachments

      1. RequireSecurePorts.patch (1 kB, Jakob Homan)
      2. HDFS-1150-trunk-3.patch (29 kB, Jakob Homan)
      3. HDFS-1150-trunk-2.patch (82 kB, Jakob Homan)
      4. HDFS-1150-trunk.patch (29 kB, Jakob Homan)
      5. HDFS-1150-Y20-BetterJsvcHandling.patch (6 kB, Jakob Homan)
      6. HDFS-1150-BF-Y20-LOG-DIRS-2.patch (4 kB, Jakob Homan)
      7. HDFS-1150-BF-Y20-LOG-DIRS.patch (3 kB, Jakob Homan)
      8. hdfs-1150-bugfix-1.2.patch (5 kB, Jitendra Nath Pandey)
      9. commons-daemon-1.0.2-src.tar.gz (379 kB, Jitendra Nath Pandey)
      10. hdfs-1150-bugfix-1.1.patch (5 kB, Jitendra Nath Pandey)
      11. hdfs-1150-bugfix-1.patch (0.4 kB, Devaraj Das)
      12. HDFS-1150-BF1-Y20.patch (1 kB, Jakob Homan)
      13. HDFS-1150-Y20S-ready-8.patch (32 kB, Jakob Homan)
      14. HDFS-1150-Y20S-ready-7.patch (31 kB, Jakob Homan)
      15. HDFS-1150-Y20S-ready-6.patch (31 kB, Jakob Homan)
      16. HDFS-1150-Y20S-ready-5.patch (30 kB, Jakob Homan)
      17. HDFS-1150-Y20S-Rough-4.patch (30 kB, Jakob Homan)
      18. HDFS-1150-Y20S-Rough-3.patch (30 kB, Jakob Homan)
      19. HDFS-1150-y20.build-script.patch (2 kB, Jitendra Nath Pandey)
      20. HDFS-1150-Y20S-Rough-2.patch (29 kB, Jakob Homan)
      21. HDFS-1150-Y20S-Rough.txt (29 kB, Jakob Homan)

          Activity

          Allen Wittenauer added a comment -

          Chances are good that $HADOOP_HOME was valid during the tests so the failure wouldn't have been detected.

          Tsz Wo Nicholas Sze added a comment -

          I should have given some more background on why I was looking at this:

          It is because of HDFS-1781, which says that the path for jsvc is incorrect. Below is the fix.

          -  exec "$HADOOP_HOME/bin/jsvc" \
          +  exec "$HADOOP_HDFS_HOME/bin/jsvc" \
          

          Then I wondered: did we ever run it before committing the patch? Probably not.

          Anyway, thanks Jakob for saying to open a JIRA.

          Jakob Homan added a comment -

          If you look back at the intermediate patches, you'll see that for quite a while we were downloading the jsvc source and compiling it. This caused problems (one having to do with Y! building on 64-bit machines but deploying against 32-bit JVMs, and there were others, as I recall), so we went with downloading jsvc directly so we could avoid compilation and specify a specific version. No one is arguing this is ideal; we agree it can be improved, and I've said I'll open a JIRA to do so. jsvc can be more tightly integrated in the native build process.

          Tsz Wo Nicholas Sze added a comment -

          I should clarify this: my comment on the hard-coded URL is not saying that we should support Windows. The problem is the hard-coded URL.

          Allen Wittenauer added a comment -

          Who says I'm happy about it? We've always done a pretty horrible job on portability. Why should security be any different?

          That said, I agree with Jakob that Windows is a mostly wasted effort. But that's a discussion for a different Jira.

          Tsz Wo Nicholas Sze added a comment -

          Jakob, thanks for the response; I hope you can continue contributing to Hadoop by providing patches.

          Allen, also thanks for your comments (and I am surprised that you seem happy that security won't work on Solaris).

          Jakob Homan added a comment -

          Nicholas-
          Regarding the packaging question you asked yesterday (open-source development and collaboration sometimes does have a bit of lag), you raise a good point that wasn't addressed by the original patch. As you can see, this patch was written in the middle of the security work under tight deadlines. There is certainly room for improvement. I will file a follow-up JIRA and we can improve this. After you noticed this, I spoke with Allen yesterday to brainstorm ways to improve it, hopefully as part of the ongoing packaging discussions, but would love to have your input.
          As far as Windows goes, as pointed out above, we've never supported security under Windows. As far as I know, you're the last HDFS developer working day to day in a Windows environment. Perhaps it's time the community took another look at whether or not it's still worth it to provide this support. Patches to make new features work under Windows after their original *nix patches are common; please file one for this, if you like.

          Allen Wittenauer added a comment -

          We don't advertise support for security to work on anything but Linux. So picking an invalid case doesn't mean it failed.

          Tsz Wo Nicholas Sze added a comment -

          Hi Allen,

          I am not saying we should make it work on Windows; I just wonder if Jakob knows the answer to your question, since I have no idea.

          For the hard-coded jsvc.location problem, Jakob seems to be saying that it will work using -Djsvc.location. I tried it, but it failed. So I would like to ask him to follow up, because it seems that he has stopped responding.

          Allen Wittenauer added a comment -

          I don't see why it is Jakob's responsibility to answer my Windows compatibility question.

          Let's face it: the probability that the secure Hadoop code functions on Windows is low. We all sort of agreed that the security features would fall into that bucket of stuff where we (right or wrong) allow for non-portability.

          But let me clarify my point, since I think you missed it: if one can't run a secure Hadoop cluster on Windows, then the fact that Apache's default distribution method for the Windows binary is a zip file is irrelevant. The person doing the portability work for secure Hadoop on Windows would likely need to either fix this code or (much more likely; see privileges on Solaris, Todd's capabilities work for RHEL, etc.) use a different method to guarantee that the DN starts on a privileged port.

          On the issue of builds, we already require that users pass flags to determine whether they want to build with 32-bit or 64-bit. This isn't any different than any of those, realistically. Given the push to use packaging for the next release (in addition to a tarball), then the appropriate binary will be in the appropriate package. The tarball including any native code was likely a mistake. We should have really made a separate "overlay" tarball that would be applied over the non-architecture specific one.

          Tsz Wo Nicholas Sze added a comment -

          Jakob, any responses on the following?

          • My question on the release
          • My comment on -Djsvc.location not working
          • Allen's question on Windows compilation

          Since you have provided and committed the patch, I hope you could follow up.

          Tsz Wo Nicholas Sze added a comment -

          > Does the JNI/C/C++ code that is needed for security even compile on Windows?

          No idea.

          Allen Wittenauer added a comment -

          Does the JNI/C/C++ code that is needed for security even compile on Windows?

          Tsz Wo Nicholas Sze added a comment -

          >> By specifying -Djsvc.location ...
          >
          > Just tried. The build failed.

          The reason is that the committed code expects a .gz file, but the jsvc release I pointed at is a .zip file.
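          As a purely illustrative sketch (the property, file, and target names here are assumptions, not the committed build.xml), the jsvc target could branch on the archive suffix and expand either format:

              <condition property="jsvc.is.zip">
                <matches string="${jsvc.location}" pattern="\.zip$"/>
              </condition>

              <!-- Expand a .zip archive (e.g. the Windows jsvc binaries) -->
              <target name="jsvc-expand-zip" if="jsvc.is.zip">
                <unzip src="${jsvc.dest}" dest="${build.dir}/jsvc"/>
              </target>

              <!-- Expand a GNU .tar.gz archive (e.g. the Linux jsvc binaries) -->
              <target name="jsvc-expand-tar" unless="jsvc.is.zip">
                <untar src="${jsvc.dest}" dest="${build.dir}/jsvc" compression="gzip"/>
              </target>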

          Tsz Wo Nicholas Sze added a comment -

          > By specifying -Djsvc.location ...

          Just tried. The build failed.

          $ ant jsvc -Djsvc.location=http://archive.apache.org/dist/commons/daemon/binaries/1.0.2/windows/commons-daemon-1.0.2-bin-windows.zip
          Buildfile: build.xml
          
          jsvc:
              [mkdir] Created dir: z:\_tsz\hadoop\hdfs\h1\build\jsvc
                [get] Getting: http://archive.apache.org/dist/commons/daemon/binaries/1.0.2/windows/commons-daemon-1.0.2-bin-windows.zip
                [get] To: z:\_tsz\hadoop\hdfs\h1\build\jsvc\jsvc.tar.gz
          
          BUILD FAILED
          z:\_tsz\hadoop\hdfs\h1\build.xml:1865: Error while expanding z:\_tsz\hadoop\hdfs\h1\build\jsvc\jsvc.tar.gz
          
          Total time: 1 second
          
          Allen Wittenauer added a comment -

          Ugh. We really need to do better documentation on how to build Hadoop, esp wrt the security features. I know I didn't know about the jsvc.location ant param. (But then again, I don't build as often as most of you.)

          FWIW, we're going to have this problem with all compiled code that doesn't use architecture-aware naming, even in some cases between 32-bit and 64-bit on Linux, without a lot of work in various locations. (Oh how I wish isaexec() and isalist() were on all platforms!)

          Tsz Wo Nicholas Sze added a comment -

          Please correct me if I am wrong: when we have a release, it will include this linux-i386 specific lib in the tar. Then, if some users try to install our release on other machines, will it just not work?

          Jakob Homan added a comment -

          By specifying -Djsvc.location one can specify a different location or version from which to pull the jsvc package. The package is expanded as part of the build process. Thus, if one needs a different platform, they can specify it here. jsvc on OS X was really flaky, as I remember it, but this was tested against 64- and 32-bit versions of jsvc.

          Tsz Wo Nicholas Sze added a comment -

          How could "-Dparameter" be used exactly?

          Jakob Homan added a comment -

          We went back and forth on this for a long time, and eventually compromised on this since this was the most often used version. There is a -Dparameter you can pass to package a different version of jsvc. I agree, this isn't ideal.

          Tsz Wo Nicholas Sze added a comment -

          We probably should not hard code the jsvc location.

          +  <property name="jsvc.location" value="http://apache.org/dist/commons/daemon/binaries/1.0.2/linux/commons-daemon-1.0.2-bin-linux-i386.tar.gz" />
          

          BTW, would this work for non-linux or non-i386 architecture?

          Jakob Homan added a comment -

          Small follow-up patch. Our Ops team had requested that, during the transition to secure ports, the secure datanode not bail out but rather give a warning if non-privileged ports were specified. Now that the transition has been done, this patch changes the secure datanode to throw an RTE if provided with non-privileged ports. This is for Y!20 only; trunk already has this behavior.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #370 (See https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/370/)
          HDFS-1157. Modifications introduced by HDFS-1150 are breaking aspect's bindings. Contributed by Konstantin Boudnik

          Jakob Homan added a comment -

          Thanks for the review, Devaraj. I've committed this. Resolving as fixed.

          Devaraj Das added a comment -

          +1

          Jakob Homan added a comment -

          Hudson's gone AWOL again. Manually did tests. All pass.
          Test-patch:

               [exec] +1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
          Jakob Homan added a comment -

          Submitting patch now that the latest common jar has been pushed to Maven...

          Jakob Homan added a comment -

          Ahhh... I'd forgotten to merge changes in from trunk; sorry about that. Here's a good version.

          Devaraj Das added a comment -

          Looks like you uploaded the wrong patch.

          Jakob Homan added a comment -

          Updated Javadoc per Devaraj's review. Common artifacts aren't being pushed to Maven at the moment, so test-patch will fail without the common part. I'll wait a few hours to see if this gets fixed. Otherwise, I'll run the tests overnight.

          Devaraj Das added a comment -

          Apart from missing javadoc in some places like instantiateDataNode, it looks good. Hope that the discussion on other approaches to secure the DN process is fruitful in the other jira.

          Jakob Homan added a comment -

          I've opened HDFS-1326 for adding the new ability to specify a non-jsvc method of securing the datanodes' non-RPC ports.

          Jakob Homan added a comment -

          Do whatever you like. As I've said repeatedly, this patch is going to mirror the 20 patch. I'm not changing it, but will be happy to add this new feature in another JIRA.

          Todd Lipcon added a comment -

          > Regardless, this is a significant change to the current patch

          Apologies if I haven't been clear, but the only change I'm asking for from the current patch is the following one line addition:

          -    if(UserGroupInformation.isSecurityEnabled() && resources == null)
          +    if(UserGroupInformation.isSecurityEnabled() && resources == null &&
          +       conf.getBoolean("dfs.datanode.require.secure.ports", true))
          

          If you want me to open this one line change as a separate JIRA, I'll do so, but it seemed easier to just modify the current patch. I'm not saying we need a full pluggability framework at this point – only the ability to start a secure DN without jsvc when we've already secured it through another external mechanism. Note that the default I'm recommending doesn't change the behavior you'd like, and we can leave this as an undocumented config for advanced users only, so we can feel free to change it if we add some pluggability here.
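          For illustration, the opt-out described here might look like the following hdfs-site.xml entry; the key name and default are taken from the one-line diff above, and this is a sketch rather than a committed configuration:

              <!-- Hypothetical opt-out for clusters that secure the DN ports through an
                   external mechanism (e.g. SELinux); the default of true keeps the
                   jsvc/privileged-port requirement. -->
              <property>
                <name>dfs.datanode.require.secure.ports</name>
                <value>false</value>
              </property>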

          Jakob Homan added a comment -

          > I would in fact argue that both of these solutions are more secure

          OK, do so if you wish. But I'm not sure who you're arguing with since I'm trying to work with you to get your request into the code and have already agreed that the jsvc approach is not perfect. Regardless, this is a significant change to the current patch and, after all of the work that we've done, we're trying to keep the y20 and trunk patches as similar as possible. Therefore, as I wrote above, I will be happy to open a new JIRA to add this functionality once 1150 has been finished off (https://issues.apache.org/jira/browse/HDFS-1150?focusedCommentId=12894236&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12894236).

          Todd Lipcon added a comment -

          If your "support" for security comes with quotation marks, you've got a problem.

          The quotes were to say that we don't need to explicitly add support in order to use an external mechanism. Securing a high port with SELinux is equally as good as the jsvc solution that uses a low port. On Solaris you can use user-based privileges to grant the hdfs user access to bind to a low port. This is also at least as good as the jsvc solution and doesn't require any code changes to Hadoop. I would in fact argue that both of these solutions are more secure, since the Hadoop administrators don't need root on the system except for the initial host configuration.

          > Administrators would need to affirmatively decline this type of protection, perhaps with a value to the key of "No, thanks."

          Hence my above point that it should be a configuration with the default as you suggested – something that advanced users (and developers) can override.

          I also disagree with your general point that Hadoop should make it impossible to misconfigure it in such a way that there are security holes. Already with your solution you're relying on ops to provide some of the security - for example, if users have root on any machine in the same subnet, they can take over the IP of one of the datanodes by spamming arps. So long as we're taking the shortcut instead of actually putting SASL on the xceiver protocol, we need external security. I agree completely with the decision to take the workaround in the short term, but making arguments about "security" vs real security seems strange given the context.

          Jakob Homan added a comment -

          > The point is that we already "support" the SELinux way

          If your "support" for security comes with quotation marks, you've got a problem.

          > Think of it like i.already.secured.my.datanode.port.with.some.external.mechanism

          This is a more reasonable way of arguing your request. Essentially it's the beginning of a pluggable system where the verification is Ops' word that they've taken care of this security hole. My concern remains that this provides no way for a running cluster to realize it's been misconfigured, although I've been thinking we need a security info page on the NN/JT (along with HADOOP-6823 and HADOOP-6822) and this could be displayed there (although the danger for non-updated, erroneous configs scattered around the cluster still remains). Administrators would need to affirmatively decline this type of protection, perhaps with a value to the key of "No, thanks."

          Todd Lipcon added a comment -

          > Until we support non-jsvc methods of doing this, it's not going to work to have a non-jsvc verified datanode

          The point is that we already "support" the SELinux way - you configure the datanode to a high port, and then set up the SELinux policy to only allow the HDFS user to bind that high port. No Hadoop-side support is necessary, but the current implementation prohibits this mechanism, which I don't think is right.

          > it would essentially be my.cluster.is.secure.except.for.this.one.attack.vector, which is not a good idea for the same reasons as above

          Think of it like i.already.secured.my.datanode.port.with.some.external.mechanism. I'm OK with this config defaulting to false (i.e. refuse to start non-secure), but it needs to be configurable. The developer testing case you mentioned is another good example.

          Jakob Homan added a comment -

          Sure. -1 on allowing unsecured datanodes to join a secure cluster, and at the moment Hadoop doesn't have a non-jsvc way of securing/verifying datanodes' ports.

          Currently, we secure the datanodes via jsvc, and the reasons for doing so were discussed extensively on this JIRA. Were we to allow the behavior requested, a mis-configured cluster could end up partially unsecured with no warning that it is in such a state, which is not acceptable.

          What you're asking for is essentially to make securing the datanodes' non-RPC ports pluggable, which we fully expect and plan to do. I'll open a JIRA to make datanode-port security pluggable once 1150 has been finished off. jsvc was a reliable solution to a problem discovered very late in security's development, which has worked very well on our production clusters, but certainly still has the odor of a hack about it. All that's needed is a way of auditing and verifying that the ports we're running on are secure by Ops' estimation; jsvc, SELinux, and AppArmor will all be reasonable ways of fulfilling such a contract.

          But until we actually have a plan to implement this in a reliable, verifiable and documented way, it's best to err on the side of caution and security and provide as much guarantee as possible that the datanodes are indeed secure in a secure cluster. Until we support non-jsvc methods of doing this, it's not going to work to have a non-jsvc verified datanode.

          As far as a config as mentioned above, it would essentially be my.cluster.is.secure.except.for.this.one.attack.vector, which is not a good idea for the same reasons as above - it's a huge configuration mistake waiting to happen - and moreover will be unnecessary once a fully pluggable system is in place. The one place it would be very useful and justifiable would be for developer testing, since it is a serious pain to start up these secure nodes while doing development now.

          Todd Lipcon added a comment -

          Hi Jakob. Can you please address my comment above? Given that there are ways to secure a high port, I don't think it should be a requirement that a secure DN must start on a low port. We have several clusters where the Hadoop team does not have easy root access, but SELinux policies could be used to the same effect. If this requirement is important for you guys, maybe we can add a configurable like dfs.datanode.require.privileged.port or somesuch?

          Jakob Homan added a comment -

          Submitting patch, though I think trunk has been kaiboshed for the moment...

          Jakob Homan added a comment -

          Patch for trunk, which seems to be broken at the moment. Relatively straightforward port of the 20 patch. New commands {start,stop}-secure-dns.sh, to be executed by root, to start and stop the secure datanodes. bin/hdfs datanode and bin/hadoop-daemon start datanode both work. Starting a secure datanode with unprivileged ports will now throw an exception, a difference from the y20 patch, but one that provides better protection against badly configured datanodes.

          Todd Lipcon added a comment -

          What about the case where root access is not available for jsvc but a secure cluster is required? There are other ways to secure the high port - eg you can use SELinux or AppArmor to restrict the ability to bind port 50010 to user "hdfs".

          I think this jsvc approach is a reasonable one, but it's not one-size-fits-all, and given there are reasonably simple external ways to get the same security, it should be a configurable requirement.

          Jakob Homan added a comment -

          > A warning makes sense, but it seems to me that it should only throw the RTE when the DN is configured on a privileged port and not run with SecureResources.

          It's simpler to have all the datanodes in a secure cluster running via jsvc. For the trunk patch, the warning about higher ports will actually most likely be an exception, as it was in earlier versions of the patch. This behavior was left as a warning on 20 to help the transition by Ops to give them time to get the lower ports opened. Having a datanode refuse to start in a secure cluster if not running under jsvc is the easiest path to not accidentally getting one running without it; the current behavior is fine.

          Todd Lipcon added a comment -

          One quick note - currently this patch makes the DN refuse to start in a secure cluster unless it's being started through jsvc, even if it's configured to a high port. A warning makes sense, but it seems to me that it should only throw the RTE when the DN is configured on a privileged port and not run with SecureResources.

          Jakob Homan added a comment -

          Updating patch, not for commit. We found that manually building the jsvc binary did not work well; there were problems if the build machine is of a different architecture (we run 32-bit JVMs for DNs/TTs). The new process is to download the file, either directly from Apache or from a location overridden via the -Djsvc.location param. This greatly simplifies the jsvc process.

          Also for those interested, in our testing here at Y!, this solution has worked great for securing the datanodes. We've run into no problems with running the datanodes under jsvc.

          Jakob Homan added a comment -

          Our Ops noticed the new jsvc invocation makes the classpath so long that they can no longer monitor the class name, so adding in a fake env variable to identify the started process...

          Jakob Homan added a comment -

          One more bug fix patch? Sure! Make logs go in the right place...

          Allen Wittenauer added a comment -

          Why can't it use the 'regular Hadoop RPC'?

          Although at this point, I just don't care about this jira. After talking with someone on the Yahoo! side yesterday, it is pretty clear that the solution you guys are building will require trusted root access to all nodes to be usable. Those of us that don't have such lax security in place will have to use a different solution anyway.

          Devaraj Das added a comment -

          > Also, where is the equivalent fix for the TaskTracker? Is anything preventing a TaskTracker from giving bad input to a task?

          Actually, all the RPCs are mutually authenticated via SASL digest authentication. The shuffle communications are also mutually authenticated. So the Task<->TaskTracker communications are secured. In the datanode case, the data transfer protocol does not use the regular Hadoop RPC; hence, we need to handle it differently.

          Jitendra Nath Pandey added a comment -

          Updated build file for jsvc to use the checked-in tarball instead of downloading it.
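          As a rough sketch only (the path and property names are assumptions, not taken from the attached build-script patch), expanding a checked-in tarball rather than fetching it could look like:

              <!-- Expand the jsvc source tarball committed to the repository instead of
                   downloading it with <get>. -->
              <untar src="${basedir}/lib/commons-daemon-1.0.2-src.tar.gz"
                     dest="${build.dir}/jsvc" compression="gzip"/>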

          Jitendra Nath Pandey added a comment -

          It seems to be a better idea to check in the source tar for jsvc, to avoid downloading it and also to have control over the version we are using. The tar ball is attached.

          Jitendra Nath Pandey added a comment -

          Fix on top of previous patches.
          This patch removes the shell script for building jsvc and uses ant tasks.

          Devaraj Das added a comment -

          Fix on top of the earlier patches. Not for commit.

          Owen O'Malley added a comment -

          The data nodes are required to authenticate as the hdfs user to the name node.

          Philip Zeyliger added a comment -

          Hi Jakob,

          This may be a naive question because I'm out of date on the HDFS security stuff. What mechanism is preventing me from starting a rogue datanode and participating in the cluster?

          Thanks,

          – Philip

          Jakob Homan added a comment -

          Add some knobs to the secure datanode starter.

          Devaraj Das added a comment -

          > Why can't the current setuid code launch the DataNode code as root, DataNode does uid detection, and then drops privs as necessary?

          This would have been ideal, except that in Java we couldn't find a way of doing this. Do you have pointers to this kind of thing being done in Java programs (other than the jsvc approach)?

          Allen Wittenauer added a comment -

          Why can't the current setuid code launch the DataNode code as root, DataNode does uid detection, and then drops privs as necessary? This allows you to store the user name to run the DataNode process as in the XML, limits the dist to one setuid exec, fixes the necessity to run the hadoop script as root, etc, etc.

          Also, where is the equivalent fix for the TaskTracker? Is anything preventing a TaskTracker from giving bad input to a task?

          Steve Loughran added a comment -

          <get> is the Ant task you are looking for to retrieve stuff; Ant's <untar> task handles GNU tar formats too.
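          A minimal sketch of that approach, assuming the jsvc.location property quoted earlier in this thread (this is illustrative, not the committed target):

              <target name="jsvc">
                <mkdir dir="${build.dir}/jsvc"/>
                <!-- <get> fetches the archive; <untar> with compression="gzip" handles GNU tar.gz -->
                <get src="${jsvc.location}" dest="${build.dir}/jsvc/jsvc.tar.gz" verbose="true"/>
                <untar src="${build.dir}/jsvc/jsvc.tar.gz" dest="${build.dir}/jsvc" compression="gzip"/>
              </target>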

          Devaraj Das added a comment -

          Allen, thanks for the review. Some of the comments are valid. On why we are using jsvc: we didn't want to let the datanode run as root entirely. We wanted to run only what is minimally required as root - the code to bind to the privileged ports. After the bind happens, privileges are dropped. Jsvc handles these aspects quite well. The mapred setuid code cannot be used for this purpose since we need to deal with a Java datanode process.

          Allen Wittenauer added a comment -

          More problems:

          • hadoop shell script doesn't check to see if jsvc is present and executable
          • hadoop shell script is hard-coded to launch the datanode process as user 'hdfs'
          • hadoop shell script is hard-coded to use /dev/stderr and /dev/stdout which seems like a bad idea if something prevents the jsvc code from working properly

          I'm really confused as to how this is actually supposed to work in real world usage:

          • It appears the intention is that the hadoop command is going to be run by root based upon checking $EUID. This means a lot more is getting executed as root than just the java process. Why aren't we just re-using the mapred setuid code to launch the datanode process rather than having this dependency?
          • This doesn't look like it will work with start-mapred.sh/start-all.sh unless those are also run as root.
          Allen Wittenauer added a comment -

          Also,

          • This patch will fail on Solaris and other operating systems that don't use GNU tar. It should use tar without the z option and pipe through gzip to be more portable.
          • It adds a requirement for wget as part of the build system, which seems unnecessary. Shouldn't Ivy or Maven or one of these other fancy build tools be able to pull this code down?
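
          On the tar portability point above, for example (the tarball name is just the attached commons-daemon source archive):

            # GNU tar only:
            tar xzf commons-daemon-1.0.2-src.tar.gz

            # Portable to Solaris and other non-GNU tars:
            gzip -dc commons-daemon-1.0.2-src.tar.gz | tar xf -
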
          Jakob Homan added a comment -

          It appears the assumption is that the attacker won't be able to get root privileges.

          This is indeed an assumption we've had for all the security work. Should one get root, they can get the Kerberos keytabs, and at that point the game's over. This approach doesn't fix that assumption, but is consistent with it.

          Allen Wittenauer added a comment -

          It appears the assumption is that the attacker won't be able to get root privileges. I don't think that's realistic for a complete fix; it seems more like a security-through-obscurity approach. This is clearly a protocol problem and should be treated as such. Doing TCP port sleight of hand isn't a real fix.

          Jakob Homan added a comment -

          Added license header to build.sh

          Jakob Homan added a comment -

          Final patch based on peer review from Jitendra and Devaraj. Fixed a couple of javadoc issues and a troublesome jdk reference. Tests look OK; test-patch shows two false findbugs warnings from pre-existing errors that were refactored into new methods.

          Jakob Homan added a comment -

          Patch is ready to go, I believe. #6

          Jakob Homan added a comment -

          Patch where everything works, for review. jsvc is behaving a bit oddly - I can't get it to redirect stdout and stderr to where they should go...

          Jakob Homan added a comment -

          How about one that actually compiles?

          Jakob Homan added a comment -

          Merged with trunk and Jitendra's patch...

          Jakob Homan added a comment -

          Updated patch with the correct ivy lib...

          Jakob Homan added a comment -

          First, very rough patch against the Y20S branch for a secure datanode wrapper that uses jsvc/commons-daemon to start up as root and then drop privileges. Still need to modify the shell scripts to use the new class, as well as add to the build file to compile jsvc (similar to MR's LinuxTaskController). Having trouble bringing in commons-daemon via ivy... Not for commit, etc, etc.

          Jakob Homan added a comment -

          The issue occurs when a client reads blocks from or writes blocks to a datanode (and along the pipeline during writing). While the datanode is able to use the block access token to verify that the client has permission to do so, the client has no way of verifying that it's talking to a genuine datanode rather than an imposter process that has come up on the datanode's port (say, after the datanode has crashed).

          To correct this, rather than change the DataTransferProtocol and potentially introduce bugs into the write pipeline, we're looking at using the commons-daemon jsvc library to start a secure datanode as root, grab the necessary resources (e.g. privileged ports) and then drop privileges. This gives clients reasonable certainty that the datanode they're talking to was started securely and is who they expect it to be.
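
          As a sketch of what that means for configuration (the port numbers below are illustrative; the only point is that they are below 1024, so binding them requires root):

            <property>
              <name>dfs.datanode.address</name>
              <value>0.0.0.0:1004</value>
            </property>
            <property>
              <name>dfs.datanode.http.address</name>
              <value>0.0.0.0:1006</value>
            </property>

          A client connecting to ports like these gets reasonable assurance that whatever answered was started by root, which is exactly the property an imposter process squatting on an unprivileged port cannot fake.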


            People

            • Assignee: Jakob Homan
            • Reporter: Jakob Homan
            • Votes: 0
            • Watchers: 12
