Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.0.3
    • Fix Version/s: None
    • Component/s: security, webapps
    • Labels: None
    • Target Version/s:

      Description

      After investigating the methodology used to add HTTPS support in branch-2, I feel that this same approach should be back-ported to branch-1. I have taken many of the patches used for branch-2 and merged them in.

      I was working on top of HDP 1 at the time - I will provide a patch for trunk soon once I can confirm I am adding only the necessities for supporting HTTPS on the webUIs.

      As an added benefit, this patch also provides an HTTPS webUI for HBase by extension: take a hadoop-core jar compiled with this patch, put it into the hbase/lib directory, and apply the necessary configs to hbase/conf.

      ========= OLD IDEA(s) BEHIND ADDING HTTPS (look @ Sept 17th patch) ==========

      In order to provide full security around the cluster, the webUI should also be secure if desired to prevent cookie theft and user masquerading.

      Here is my proposed work. Currently I can only add HTTPS support; I do not know how to switch the HttpServer over from HTTP to HTTPS entirely.

      In order to facilitate this change I propose the following configuration additions:
      CONFIG PROPERTY -> DEFAULT VALUE
      mapred.https.enable -> false
      mapred.https.need.client.auth -> false
      mapred.https.server.keystore.resource -> "ssl-server.xml"
      mapred.job.tracker.https.port -> 50035
      mapred.job.tracker.https.address -> "<IP_ADDR>:50035"
      mapred.task.tracker.https.port -> 50065
      mapred.task.tracker.https.address -> "<IP_ADDR>:50065"

      I tested this on my local box after using keytool to generate an SSL certificate. You will need to change ssl-server.xml to point to the .keystore file afterwards. A truststore may not be necessary; you can just point it to the keystore.
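
The keystore step can be sanity-checked outside Hadoop. The following is a minimal sketch using only the JDK, not code from the patch: it loads a keytool-generated keystore and builds an SSLContext from it, reusing the same keystore as the trust store as suggested above. The class name, the "JKS" store type, and the assumption that the key password equals the store password are illustrative choices, not something the issue specifies.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical helper, not part of Hadoop or of the attached patches.
final class KeystoreCheck {

    // Loads a keystore of the given type; a null stream yields an empty keystore.
    static KeyStore load(InputStream in, String type, char[] password) {
        try {
            KeyStore ks = KeyStore.getInstance(type);
            ks.load(in, password);
            return ks;
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException("cannot load keystore", e);
        }
    }

    // Builds an SSLContext that uses the same keystore for keys and for trust.
    // Assumption: the private key password is the same as the store password.
    static SSLContext sslContextFrom(KeyStore ks, char[] keyPassword)
            throws GeneralSecurityException {
        KeyManagerFactory kmf =
                KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
        kmf.init(ks, keyPassword);
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(ks);
        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(kmf.getKeyManagers(), tmf.getTrustManagers(), null);
        return ctx;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: KeystoreCheck <keystore-path> <password>");
            return;
        }
        char[] password = args[1].toCharArray();
        try (InputStream in = new FileInputStream(args[0])) {
            KeyStore ks = load(in, "JKS", password);
            SSLContext ctx = sslContextFrom(ks, password);
            System.out.println("entries: " + ks.size() + ", protocol: " + ctx.getProtocol());
        }
    }
}
```

If the keystore loads and the context initializes here, the same path and password should be usable in ssl-server.xml.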

      1. branch-1.2-patch.txt
        192 kB
        Michael Weng
      2. branch-1.2-patch.txt2
        192 kB
        Michael Weng
      3. branch-1.2-patch.txt3
        192 kB
        Michael Weng
      4. branch-1.2-patch.txt4
        194 kB
        Michael Weng
      5. branch-1.2-patch.txt5
        194 kB
        Michael Weng
      6. branch-1.2-patch.txt6
        193 kB
        Michael Weng
      7. branch-1.2-patch.txt7
        193 kB
        Michael Weng
      8. MAPREDUCE-4461.patch
        4 kB
        Plamen Jeliazkov
      9. MAPREDUCE-4661.patch
        120 kB
        Plamen Jeliazkov
      10. MAPREDUCE-4661.patch
        123 kB
        Plamen Jeliazkov
      11. MAPREDUCE-4661.patch
        88 kB
        Plamen Jeliazkov

        Issue Links

          Activity

          Plamen Jeliazkov created issue -
          Plamen Jeliazkov added a comment -

          Patch for review and comments.

          Plamen Jeliazkov made changes -
          Field Original Value New Value
          Attachment MAPREDUCE-4461.patch [ 12545486 ]
          Plamen Jeliazkov made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12545486/MAPREDUCE-4461.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2859//console

          This message is automatically generated.

          Plamen Jeliazkov made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Plamen Jeliazkov made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Target Version/s 1.0.0 [ 12318240 ]
          Fix Version/s 1.1.0 [ 12317960 ]
          Plamen Jeliazkov made changes -
          Description: appended the proposed configuration additions to the original two paragraphs, with these defaults: mapred.https.enable -> true, mapred.https.need.client.auth -> false, mapred.job.tracker.https.port -> 50035, mapred.job.tracker.https.address -> IP:50035, mapred.https.server.keystore.resource -> "ssl-server.xml", mapred.task.tracker.https.port -> 50065, mapred.task.tracker.https.address -> IP:50065.
          Plamen Jeliazkov made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Plamen Jeliazkov made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Plamen Jeliazkov made changes -
          Description: moved mapred.https.server.keystore.resource up in the property list, wrote the address defaults as "<IP_ADDR>:50035" and "<IP_ADDR>:50065", and added the closing paragraph about generating an SSL certificate with keytool and pointing ssl-server.xml at the .keystore file.
          Plamen Jeliazkov made changes -
          Affects Version/s 2.0.0-alpha [ 12320354 ]
          Plamen Jeliazkov made changes -
          Attachment MAPREDUCE-4461.patch [ 12545514 ]
          Plamen Jeliazkov made changes -
          Description: changed the default of mapred.https.enable from true to false; the rest of the text is unchanged.
          Plamen Jeliazkov made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Plamen Jeliazkov made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12545514/MAPREDUCE-4461.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2861//console

          This message is automatically generated.

          Plamen Jeliazkov made changes -
          Attachment MAPREDUCE-4461.patch [ 12545486 ]
          Plamen Jeliazkov added a comment -

          I am aware there is a patch in branch-2.
          Namely, https://issues.apache.org/jira/browse/HADOOP-8581.

          I guess I would like this back-ported in branch-1 as well; however there appears to be a lot of work that needs to be done to do so. Is it necessary to grab everything from this patch? Is a backport possible?

          Plamen Jeliazkov made changes -
          Link This issue duplicates HADOOP-8581 [ HADOOP-8581 ]
          Alejandro Abdelnur added a comment -

          You'd need the sslfactory stuff from MAPREDUCE-4417 (there is a patch for branch-1 which has not been committed; see the JIRA for details), and then you'll have to tweak the JSPs and a few other places to use the HttpConfig from HADOOP-8581 to create the URLs. Also, in Hadoop 1 the HttpServer is shared between shuffle and the webUI, so you'll have to make sure you use two connectors: one SSL for the webUI and one clear for shuffle. You also have to ensure that webUI requests are never served over the clear (shuffle) connector; you could do this with a filter.
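
The filter idea can be sketched as a small, dependency-free decision function. This is only an illustration of the rule described above, not code from any of the patches: the class name and the "/mapOutput" shuffle prefix used in the example are assumptions, and a real servlet filter would take the two inputs from the request (for instance its path and whether it arrived on a secure connector) and reject disallowed requests.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical policy object; a servlet filter could delegate to it.
final class WebUiConnectorPolicy {

    // Path prefixes that are allowed on the clear (non-SSL) connector.
    private final List<String> clearPrefixes;

    WebUiConnectorPolicy(List<String> clearPrefixes) {
        this.clearPrefixes = Collections.unmodifiableList(new ArrayList<>(clearPrefixes));
    }

    // Returns true if a request for this path may be served on this connector.
    // Assumption: the SSL connector may serve any path; the clear connector
    // may serve only the listed shuffle prefixes.
    boolean isAllowed(String path, boolean secure) {
        if (secure) {
            return true;
        }
        for (String prefix : clearPrefixes) {
            if (path.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "/mapOutput" is an assumed shuffle path used for illustration.
        WebUiConnectorPolicy policy = new WebUiConnectorPolicy(Arrays.asList("/mapOutput"));
        System.out.println(policy.isAllowed("/mapOutput", false));
        System.out.println(policy.isAllowed("/jobtracker.jsp", false));
        System.out.println(policy.isAllowed("/jobtracker.jsp", true));
    }
}
```

Keeping the decision separate from the servlet plumbing makes the rule easy to unit test for every path and connector combination.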

          Plamen Jeliazkov added a comment -

          Thank you Alejandro. I will begin working on this again.

          Plamen Jeliazkov made changes -
          Fix Version/s 1.0.4 [ 12323325 ]
          Affects Version/s 1.0.3 [ 12320250 ]
          Affects Version/s 1.0.0 [ 12318240 ]
          Affects Version/s 2.0.0-alpha [ 12320354 ]
          Target Version/s 1.0.0 [ 12318240 ] 1.0.3 [ 12320250 ]
          Component/s webapps [ 12316700 ]
          Plamen Jeliazkov added a comment -

          This is my most recent work of back-porting various patches into Hadoop 1.0.3 in order to get HTTPS working on all the webUIs.

          A conflict between the dfs.https.enabled and hadoop.ssl.enabled settings caused issues in bringing up the DFS webUIs (mostly the NameNode).

          I have made it work in this patch. A lot of files had to be touched to make it work. At this moment I can see the NameNode, JobTracker, and TaskTracker webUIs in HTTPS and not HTTP.

          This patch does not address certain hard-coded HTTP URLs within the webUIs themselves. Hopefully another patch that I put out shortly will fix that.

          Plamen Jeliazkov made changes -
          Attachment MAPREDUCE-4661.patch [ 12548296 ]
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12548296/MAPREDUCE-4661.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2915//console

          This message is automatically generated.

          Plamen Jeliazkov made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Plamen Jeliazkov added a comment -

          This is actually a far better and more comprehensive patch than the one I previously posted. The JSP pages still need to be fixed, but it is almost complete! Some of the JSP pages, like nn_browsedfscontent and browseDirectory, are already done.

          I will post a "complete" patch later.

          Plamen Jeliazkov made changes -
          Attachment https.patch [ 12548342 ]
          Plamen Jeliazkov added a comment -

          This latest patch removes a lot of the unrelated code. It is focused on just HTTPS for the webUIs. I can confirm it currently compiles on top of HDP 1. I will create a patch for trunk once I can validate with some testing that this patch works.

          Plamen Jeliazkov made changes -
          Attachment MAPREDUCE-4661.patch [ 12548474 ]
          Plamen Jeliazkov added a comment -

          Latest patch for review. This applies cleanly on top of HDP 1 and has been partially reviewed by Benoy.

          I would like some open source reviews before I go on to create patches for trunk, etc.

          Plamen Jeliazkov made changes -
          Attachment MAPREDUCE-4661.patch [ 12548803 ]
          Plamen Jeliazkov made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Plamen Jeliazkov made changes -
          Summary Add HTTPS for JobTracker and TaskTracker Add HTTPS for WebUIs on Branch-1
          Plamen Jeliazkov made changes -
          Description: added the two opening paragraphs about back-porting the branch-2 HTTPS approach to branch-1 and moved the earlier text under the "OLD IDEA(s) BEHIND ADDING HTTPS" separator.
          Plamen Jeliazkov made changes -
          Description: added the paragraph noting that the patch also provides an HTTPS webUI for HBase; the rest of the text is unchanged.
          Plamen Jeliazkov made changes -
          Component/s security [ 12313041 ]
          Benoy Antony added a comment -

          Would you be able to provide a patch for Hadoop 1?

          Matt Foley added a comment -

          1.0.4 is released now, and should probably be the last 1.0 version.
          1.1.0 is released now also.
          This change could be targeted at either 1.1.1 or 1.2.0. My guess is that it is a big enough change that it should go in 1.2.0, so that's what I marked it for.

          Matt Foley made changes -
          Fix Version/s 1.0.4 [ 12323325 ]
          Target Version/s 1.0.3 [ 12320250 ] 1.2.0 [ 12321661 ]
          Owen O'Malley added a comment -

          Please fix up:

          • remove the config change to:
            • fs.default.name
            • hdfs-site.xml
            • mapred-site.xml
            • ssl.*.location
            • ssl.*.password
          • the default value of hadoop.ssl.enabled must be false
          • remove the spurious change to InterTrackerProtocol.java and other changes related to disk failures
          • remove the spurious whitespace changes
          • downgrade the httpserver logging to debug

          Have you tested all of the combinations of hadoop.ssl.enabled and mapreduce.shuffle.ssl.enabled? What is the use case where the two values will differ?
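
One way to make sure no combination is missed is to enumerate them mechanically. The sketch below is a hypothetical helper, not part of the patch: it lists every true/false assignment for a set of boolean property names, which for the two properties in question yields the four cases to test. What each case should do is left to the actual test.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for building a test matrix of boolean settings.
final class SslFlagMatrix {

    // Returns all 2^n assignments of true/false to the given keys,
    // starting with all false and ending with all true.
    static List<Map<String, Boolean>> combinations(String... keys) {
        List<Map<String, Boolean>> out = new ArrayList<>();
        int total = 1 << keys.length;
        for (int mask = 0; mask < total; mask++) {
            Map<String, Boolean> combo = new LinkedHashMap<>();
            for (int i = 0; i < keys.length; i++) {
                // Bit i of the mask decides the value of keys[i].
                combo.put(keys[i], (mask & (1 << i)) != 0);
            }
            out.add(combo);
        }
        return out;
    }

    public static void main(String[] args) {
        for (Map<String, Boolean> combo :
                combinations("hadoop.ssl.enabled", "mapreduce.shuffle.ssl.enabled")) {
            // Each line is one configuration to bring the daemons up with.
            System.out.println(combo);
        }
    }
}
```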

          Benoy Antony made changes -
          Link This issue is depended upon by HDFS-4108 [ HDFS-4108 ]
          Plamen Jeliazkov added a comment -

          Hi Owen,

          I apologize for the long silence. I will act on your comments and generate a new patch.
          Benoy has discovered some issues with submitting a job using my patch with HTTPS enabled, and an interesting "NoSuchMethodError" when using my patch without enabling HTTPS.

          We spoke off-line about how I removed the MapReduce SSL shuffle code; most likely something in the code still relies on SSL for job submission when HTTPS is enabled. Benoy and I will work on these issues; I will then apply your comments to the patch and upload it soon.

          It appears I should also modify my code for 1.2.0.

          Plamen Jeliazkov made changes -
          Attachment https.patch [ 12548342 ]
          Benoy Antony made changes -
          Link This issue is depended upon by HDFS-4108 [ HDFS-4108 ]
          Michael Weng made changes -
          Assignee Plamen Jeliazkov [ zero45 ] Michael Weng [ michaelweng ]
          Michael Weng added a comment -

          Continuing Plamen's work. Here is the updated patch for branch-1.2.

          Michael Weng made changes -
          Attachment branch-1.2-patch.txt [ 12574623 ]
          Michael Weng added a comment -

          Found errors in WebHdfsFileSystem.java and NamenodeWebHdfsMethods.java. The patch has been updated with the fixes.

          Michael Weng made changes -
          Attachment branch-1.2-patch.txt2 [ 12574860 ]
          Michael Weng made changes -
          Link This issue blocks HBASE-8181 [ HBASE-8181 ]
          Gavin made changes -
          Link This issue blocks HBASE-8181 [ HBASE-8181 ]
          Gavin made changes -
          Link This issue is depended upon by HBASE-8181 [ HBASE-8181 ]
          Matt Foley added a comment -

          Changed Target Version to 1.3.0 upon release of 1.2.0. Please change to 1.2.1 if you intend to submit a fix for branch-1.2.

          Matt Foley made changes -
          Target Version/s 1.2.0 [ 12321661 ] 1.3.0 [ 12324153 ]
          Michael Weng added a comment -

          Updated patch for 1.2.1.

          Michael Weng made changes -
          Attachment branch-1.2-patch.txt3 [ 12584060 ]
          Devaraj Das added a comment -

          Going through the patch. Some quick questions & comments:
          1. Seems like the corresponding code in the trunk has moved some. For example, FileBasedKeyStoresFactory.java has some updates. The question is whether we should update the branch-1 patch accordingly. Maybe we should?
          2. src/test/org/apache/hadoop/http/TestSSLHttpServer.java has some commented out code, and that is also different (although maybe cosmetically) than trunk's.

          I'll go through some more and might have some more questions. How much testing has the patch seen (unit tests & manual)?

          Michael Weng added a comment -

          Thanks for the comments.

          I have pulled the new versions of FileBasedKeyStoresFactory.java and TestSSLHttpServer.java from Hadoop 2. The following files need to be updated to match those changes:

          1. modified: src/core/org/apache/hadoop/http/HttpConfig.java
          2. modified: src/core/org/apache/hadoop/http/HttpServer.java
          3. modified: src/core/org/apache/hadoop/security/ssl/FileBasedKeyStoresFactory.java
          4. modified: src/core/org/apache/hadoop/security/ssl/SSLFactory.java
          5. modified: src/core/org/apache/hadoop/util/PlatformName.java
          6. modified: src/test/org/apache/hadoop/http/TestSSLHttpServer.java

          I do need to remove the use of com.google.common.annotations.VisibleForTesting. Will provide the new patch soon.

          Michael Weng added a comment -

          Tested:
          Ran the full unit test suite as part of the build. There are a few failures that I think are not related to the change. For system tests, I ran the patch on a 5-machine VM cluster and then on a 60-machine real cluster, both with security enabled, and performed many sample operations. I also tested turning HTTPS off in the config. The SecondaryNameNode was running during testing, and I verified download/upload of the fsimage.
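
The "HTTPS off" case mentioned above comes down to a boolean config switch that selects the URL scheme. A minimal sketch of that idea follows, assuming a plain java.util.Properties object in place of Hadoop's Configuration. The class and method names are hypothetical, the property name is borrowed from the older proposal in the issue description (the current patch may use a different key), and the host and ports are example values only.

```java
import java.util.Properties;

// Hypothetical helper, not part of the patch: picks the URL scheme from a
// boolean config flag so that HTTPS can be switched off again.
class SchemeToggleSketch {
  // Property name taken from the older proposal in the issue description;
  // the branch-2 style patch may use a different key.
  static final String HTTPS_ENABLE_KEY = "mapred.https.enable";

  // Returns "https://" when the flag is true, "http://" otherwise
  // (including when the property is missing).
  static String schemePrefix(Properties conf) {
    boolean https =
        Boolean.parseBoolean(conf.getProperty(HTTPS_ENABLE_KEY, "false"));
    return https ? "https://" : "http://";
  }

  // Builds a web UI base URL using the configured scheme.
  static String webUiUrl(Properties conf, String host, int port) {
    return schemePrefix(conf) + host + ":" + port;
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    // Flag unset: falls back to plain HTTP.
    System.out.println(webUiUrl(conf, "localhost", 50030));
    // Flag set: same code path now produces an HTTPS URL.
    conf.setProperty(HTTPS_ENABLE_KEY, "true");
    System.out.println(webUiUrl(conf, "localhost", 50035));
  }
}
```

Keeping the scheme decision in one place means that testing with HTTPS disabled exercises the same code path as the secure setup.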

          Michael Weng added a comment -

          New patch attached.

          Michael Weng made changes -
          Attachment branch-1.2-patch.txt4 [ 12592433 ]
          Devaraj Das added a comment -

          Some comments:
          1. HTTP_MAX_THREADS is not used in the patch. It should be used in the HttpServer's constructor when creating the QueuedThreadPool.
          2. In TaskTrackerStatus.java, could we have a new constructor with the new shufflePort argument (and in the old constructor have the value of shufflePort default to the httpPort)?
          3. The getFallBackAuthenticator implementation in KerberosAuthenticator needs to set the configurator in the PseudoAuthenticator instance before returning.
          4. In DataNode.java, remove the check for isSecure() in the constructor.

          On the testing front, please ensure that things like SecondaryNamenode<->PrimaryNamenode communication and distcp continue to work as usual. Also, paste the results of test-patch and the unit test runs.
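
Point 2 above describes a common backward-compatible constructor pattern. A minimal sketch of it follows, assuming a simplified stand-in class. TrackerStatusSketch and its fields are hypothetical and do not mirror the real TaskTrackerStatus, which carries many more arguments.

```java
// Hypothetical stand-in for TaskTrackerStatus; class and field names are
// illustrative only and do not mirror the real class.
class TrackerStatusSketch {
  private final String host;
  private final int httpPort;
  private final int shufflePort;

  // Old-style constructor: existing callers keep compiling, and the
  // shuffle port defaults to the HTTP port.
  TrackerStatusSketch(String host, int httpPort) {
    this(host, httpPort, httpPort);
  }

  // New constructor: callers that know about a separate shuffle port
  // pass it explicitly.
  TrackerStatusSketch(String host, int httpPort, int shufflePort) {
    this.host = host;
    this.httpPort = httpPort;
    this.shufflePort = shufflePort;
  }

  String getHost() { return host; }
  int getHttpPort() { return httpPort; }
  int getShufflePort() { return shufflePort; }

  public static void main(String[] args) {
    // Port numbers are example values only.
    TrackerStatusSketch legacy = new TrackerStatusSketch("tt1", 50060);
    TrackerStatusSketch split = new TrackerStatusSketch("tt1", 50060, 50065);
    System.out.println(legacy.getShufflePort());
    System.out.println(split.getShufflePort());
  }
}
```

Delegating from the old constructor to the new one keeps the defaulting rule in a single place, so the two code paths cannot drift apart.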

          Michael Weng added a comment -

          Thanks Devaraj.

          Here is the new patch with changes mentioned in your comments. Files being touched are:

          1. modified: src/core/org/apache/hadoop/http/HttpServer.java
          2. modified: src/core/org/apache/hadoop/security/authentication/client/KerberosAuthenticator.java
          3. modified: src/hdfs/org/apache/hadoop/hdfs/server/datanode/DataNode.java
          4. modified: src/mapred/org/apache/hadoop/mapred/TaskTrackerStatus.java

          And BTW, how can I get the unit test results? Should I copy and paste them from the terminal output, or is there a different way? This is the command I used to run the unit tests:

          ant -Dforrest.home=$FORREST_HOME -Djava5.home=$JAVA5_HOME -Dcompile.c++=true -Dcompile.native=true clean test

          Michael Weng made changes -
          Attachment branch-1.2-patch.txt5 [ 12593258 ]
          Michael Weng added a comment -

          Fixed the task log URL and the SN problem with HttpServer running as a daemon. The changes relative to the previous patch follow.
          -----------
          diff --git a/src/core/org/apache/hadoop/http/HttpServer.java b/src/core/org/apache/hadoop/http/HttpServer.ja
          index 0047d64..efcaad6 100644
          --- a/src/core/org/apache/hadoop/http/HttpServer.java
          +++ b/src/core/org/apache/hadoop/http/HttpServer.java
          @@ -167,7 +167,6 @@ public class HttpServer implements FilterContainer {
               // default value (currently 250).
               QueuedThreadPool threadPool = maxThreads == -1 ?
                   new QueuedThreadPool() : new QueuedThreadPool(maxThreads);
          -    threadPool.setDaemon(true);
               webServer.setThreadPool(threadPool);
           
               final String appDir = getWebAppsPath();
          diff --git a/src/mapred/org/apache/hadoop/mapred/JobHistory.java b/src/mapred/org/apache/hadoop/mapred/JobHi
          index 4ba2e38..9d701f5 100644
          --- a/src/mapred/org/apache/hadoop/mapred/JobHistory.java
          +++ b/src/mapred/org/apache/hadoop/mapred/JobHistory.java
          @@ -2787,7 +2787,7 @@ public class JobHistory {
              * task-attempt-id are unavailable.
              */
             public static String getTaskLogsUrl(JobHistory.TaskAttempt attempt) {
          -    if (attempt.get(Keys.SHUFFLE_PORT).equals("")
          +    if (attempt.get(Keys.HTTP_PORT).equals("")
                   || attempt.get(Keys.TRACKER_NAME).equals("")
                   || attempt.get(Keys.TASK_ATTEMPT_ID).equals("")) {
                 return null;
          @@ -2797,6 +2797,6 @@ public class JobHistory {
                 JobInProgress.convertTrackerNameToHostName(
                     attempt.get(Keys.TRACKER_NAME));
               return TaskLogServlet.getTaskLogUrl(taskTrackerName, attempt
          -        .get(Keys.SHUFFLE_PORT), attempt.get(Keys.TASK_ATTEMPT_ID));
          +        .get(Keys.HTTP_PORT), attempt.get(Keys.TASK_ATTEMPT_ID));
             }
           }
          -----------

          Also attached the new patch.
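
The JobHistory change comes down to building the task log URL from the tracker's HTTP port rather than its shuffle port, and returning null when any required value is missing. A minimal sketch of that logic follows, assuming a plain Map in place of JobHistory.TaskAttempt. The class name, the string keys, and the URL layout are all assumptions made for illustration; the real code delegates to TaskLogServlet.getTaskLogUrl and first converts the tracker name to a host name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the getTaskLogsUrl logic; it does not use the
// real JobHistory or TaskLogServlet classes.
class TaskLogUrlSketch {
  // Returns null if the HTTP port, tracker name, or task attempt id is
  // missing or empty; otherwise builds a URL on the HTTP port.
  static String taskLogsUrl(Map<String, String> attempt) {
    String httpPort = attempt.getOrDefault("HTTP_PORT", "");
    String tracker = attempt.getOrDefault("TRACKER_NAME", "");
    String attemptId = attempt.getOrDefault("TASK_ATTEMPT_ID", "");
    if (httpPort.isEmpty() || tracker.isEmpty() || attemptId.isEmpty()) {
      return null;
    }
    // URL layout is an assumption for this sketch, not the servlet's
    // confirmed format.
    return "http://" + tracker + ":" + httpPort
        + "/tasklog?attemptid=" + attemptId;
  }

  public static void main(String[] args) {
    Map<String, String> attempt = new HashMap<>();
    attempt.put("TRACKER_NAME", "tt1.example.com");
    attempt.put("TASK_ATTEMPT_ID", "attempt_0001_m_000000_0");
    attempt.put("SHUFFLE_PORT", "50065");
    // Only the shuffle port is recorded, so no URL can be built.
    System.out.println(taskLogsUrl(attempt));
    attempt.put("HTTP_PORT", "50060");
    // With the HTTP port present, the URL targets that port.
    System.out.println(taskLogsUrl(attempt));
  }
}
```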

          Michael Weng made changes -
          Attachment branch-1.2-patch.txt6 [ 12596246 ]
          Michael Weng added a comment -

          The patch has been running on our large production cluster for more than a week.

          Some unit tests fail, but the failures don't seem to be related to the change:
          ---------
          [junit] Test org.apache.hadoop.io.compress.TestCodec FAILED
          [junit] Test org.apache.hadoop.fs.TestFsShellReturnCode FAILED
          [junit] Test org.apache.hadoop.hdfs.TestFileCreation FAILED
          [junit] Test org.apache.hadoop.hdfs.server.namenode.TestNNThroughputBenchmark FAILED
          [junit] Test org.apache.hadoop.mapred.TestJobHistory FAILED
          [junit] Test org.apache.hadoop.mapred.TestLostTracker FAILED
          ---------

          Michael Weng added a comment -

          Found an error in JobHistory.java that breaks TestJobHistory and TestLostTracker. New patch is attached. Changes compared with previous patch:
          -------
          - Keys.TRACKER_NAME, Keys.HTTP_PORT,
          + Keys.TRACKER_NAME, Keys.HTTP_PORT, Keys.SHUFFLE_PORT,
          - * @return the taskLogsUrl. null if shuffle-port or tracker-name or
          + * @return the taskLogsUrl. null if http-port or tracker-name or
          -------

          TestFileCreation and TestNNThroughputBenchmark pass when run individually after cleaning up.

          The following two test cases fail both with and without the changes in the patch.

          [junit] Test org.apache.hadoop.io.compress.TestCodec FAILED
          [junit] Test org.apache.hadoop.fs.TestFsShellReturnCode FAILED

          Michael Weng made changes -
          Attachment branch-1.2-patch.txt7 [ 12596479 ]
          Transition                 Time In Source Status   Execution Times   Last Executer      Last Execution Date
          Open -> Patch Available    5m 56s                  4                 Plamen Jeliazkov   18/Sep/12 02:40
          Patch Available -> Open    21d 2h 24m              4                 Plamen Jeliazkov   09/Oct/12 02:10
          Open -> In Progress        2d 19h 42m              1                 Plamen Jeliazkov   11/Oct/12 21:52

            People

            • Assignee: Michael Weng
            • Reporter: Plamen Jeliazkov
            • Votes: 0
            • Watchers: 10
