Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: security
    • Labels: None

      Description

      Since the user's data is available via http from the TaskTrackers, we should require a job-specific secret to access it.

      1. MR-1026-0_20.2.patch
        34 kB
        Jitendra Nath Pandey
      2. MAPREDUCE-1026-9.patch
        37 kB
        Boris Shkolnik
      3. MAPREDUCE-1026-7.patch
        38 kB
        Boris Shkolnik
      4. MAPREDUCE-1026-3.patch
        28 kB
        Boris Shkolnik
      5. MAPREDUCE-1026-2.patch
        29 kB
        Boris Shkolnik
      6. MAPREDUCE-1026-15.patch
        38 kB
        Boris Shkolnik
      7. MAPREDUCE-1026-14.patch
        38 kB
        Boris Shkolnik
      8. MAPREDUCE-1026-13.patch
        38 kB
        Boris Shkolnik
      9. MAPREDUCE-1026-12.patch
        38 kB
        Boris Shkolnik
      10. MAPREDUCE-1026-1.patch
        24 kB
        Boris Shkolnik
      11. MAPREDUCE-1026.patch
        30 kB
        Boris Shkolnik
      12. MAPREDUCE-1026.patch
        29 kB
        Boris Shkolnik
      13. 1026-bp20-bugfix.patch
        5 kB
        Devaraj Das

        Issue Links

          Activity

          Owen O'Malley added a comment -

          The JobClient should create a random key of 10 characters from [a-zA-Z0-9] and put it in the job conf as secret.mapred.job.shuffle.key. I'd propose that we add all secret keys in a sub-tree of the config key space (secret.*) so that the web ui can hide them. The reducer can include the key in the url and the TaskTracker can check to make sure it is correct.
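Owen's proposal can be sketched as follows. This is only an illustration of the idea, not code from any patch on this issue; the class and method names are made up:

```java
import java.security.SecureRandom;

public class ShuffleKey {
    private static final String ALPHABET =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    // Generate a random key of the given length drawn from [a-zA-Z0-9].
    public static String generateKey(int length) {
        SecureRandom random = new SecureRandom();
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The JobClient would put this in the job conf, e.g.:
        // conf.set("secret.mapred.job.shuffle.key", generateKey(10));
        System.out.println(generateKey(10));
    }
}
```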

          Jeff Hammerbacher added a comment -

          Hey Owen (and probably Doug),

          While we're here: how would this strategy change if map output was transferred to the reducers using Avro's RPC? Is there authentication in the handshake, and encryption (ssl?) for the data?

          Just trying to educate myself for The Future (tm).

          Thanks,
          Jeff

          Allen Wittenauer added a comment -

          > 10 characters from [a-zA-Z0-9]

This seems like a fairly small key space that one could use Hadoop on a small cluster to break. Why not just use MD5 or SHA1 with 128 or 256 bit keys?

          Owen O'Malley added a comment -

          Avro RPC won't have bulk data or authentication for a while, I suspect.

          But the answer is yes, once there is authentication on the rpc, we can use that. In particular, the rpc will be able to use token/secret keys for authentication and that would be appropriate for this context. (Clearly a key exchange involving the kdc would never be performant enough for the shuffle.)

          Owen O'Malley added a comment -

I just wanted to get a proposal out there. 62^10 is very big. It is roughly 2^60.
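The arithmetic is easy to check: [a-zA-Z0-9] has 62 characters, so a 10-character key gives 62^10 possibilities, about 2^59.5. A quick sketch (not project code):

```java
public class KeySpace {
    public static void main(String[] args) {
        int alphabetSize = 62;  // [a-zA-Z0-9]
        int keyLength = 10;
        // bits of entropy = keyLength * log2(alphabetSize)
        double bits = keyLength * (Math.log(alphabetSize) / Math.log(2));
        System.out.printf("Key space is about 2^%.1f%n", bits);
    }
}
```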

          Kan Zhang added a comment -

I had some rough ideas for this when I opened HADOOP-4991. Briefly:
          1. The output of Map tasks of a job should be accessed only by Reduce tasks of the same job.
          2. Since currently this access is done over HTTP, I suggest we use HTTP DIGEST authentication mechanism as defined in RFC 2617. This is better than HTTP BASIC authentication since in the case of HTTP DIGEST, the secret key is never sent over to the server in the clear and it allows for mutual authentication.
          3. We should use whatever key length that is recommended by the standard and JCE implementation.
          4. The key is per-job and should be chosen by the JobTracker at job submission and persisted in the job conf in such a way that only tasks of that job + TT/JT can access it. I favor chosen by JT over chosen by JobClient for 2 reasons.

          • The key is considered an internal detail of the M/R framework and should be transparent to anyone outside the M/R cluster, including the JobClient.
          • You don't need to worry about the key being accidentally disclosed before/after being submitted to the JT at the client side.
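For reference, the RFC 2617 digest computation Kan mentions boils down to three MD5 hashes (shown here in the simpler form without qop). This is a sketch with hypothetical values, not Hadoop code; the point is that the password itself never crosses the wire:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DigestAuth {
    static String md5Hex(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                .digest(s.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : d) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e);
        }
    }

    // RFC 2617 response (without qop): MD5(HA1 ":" nonce ":" HA2)
    static String response(String user, String realm, String password,
                           String method, String uri, String nonce) {
        String ha1 = md5Hex(user + ":" + realm + ":" + password);
        String ha2 = md5Hex(method + ":" + uri);
        return md5Hex(ha1 + ":" + nonce + ":" + ha2);
    }

    public static void main(String[] args) {
        // Hypothetical values; the reduce task would play the client role.
        System.out.println(response("reduce_0", "mapred", "job-secret",
                                    "GET", "/mapOutput", "abc123"));
    }
}
```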
          Owen O'Malley added a comment -

          1. Of course

2. I'm pretty agnostic about what the authentication mechanism is, other than that I don't want an extra round trip. I don't see any way of doing a hash without an extra round trip on the connection open. On the other hand, sending a password doesn't reveal anything that isn't already known: if the attacker can sniff the network, they already know the secret.

3. If there is a better key length, we can use it. 62^10 is big enough to be safe.

          4. Of course

5. The key is per-job of course, but there is no advantage to having the JobTracker pick it. Either way it will be framework code that picks it. Putting it in the job conf is easy, and secure (once MAPREDUCE-181 goes in). Given that the key will be at the JobTracker and all of the TaskTrackers, I don't see the submitting node as a problem.

          Devaraj Das added a comment -

          Summarizing some offline discussions:
1. The 1.5 extra round trips to the TaskTracker for HTTP Digest authentication could be a significant cost when the map outputs are small.
          2. Instead of that, can we do the following:
          2.1. Tasks authenticate to the TaskTrackers by simply passing the key in the URL. This doesn't cost us anything.
2.2. Map tasks encrypt the final spill files on the map side when they are written to disk (and reducers decrypt them). This could be done using a key different from the shuffle key used in 2.1.
The idea is that at some point we should anyway have encrypted map outputs, for maximum security of the intermediate outputs. We can do that on-the-wire via https, or have encrypted files. The latter should be much less costly than the former. The point of having both 2.1 and 2.2 is to make the transfer very secure without introducing the overhead of extra round trips for (digest) authentication.
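A sketch of what 2.2 could look like using the JCE. AES/CTR is just one possible cipher choice; the key handling, IV management, and streaming over real spill files are all assumptions here, not part of this issue's patches:

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

public class SpillCrypto {
    // Encrypt or decrypt a spill buffer with AES/CTR under a per-job key.
    static byte[] crypt(int mode, byte[] key, byte[] iv, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(mode, new SecretKeySpec(key, "AES"),
                        new IvParameterSpec(iv));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[16];  // hypothetical 128-bit per-job key
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(key);
        new SecureRandom().nextBytes(iv);

        byte[] spill = "map output record".getBytes(StandardCharsets.UTF_8);
        byte[] onDisk = crypt(Cipher.ENCRYPT_MODE, key, iv, spill);   // map side
        byte[] fetched = crypt(Cipher.DECRYPT_MODE, key, iv, onDisk); // reduce side
        System.out.println(Arrays.equals(spill, fetched));
    }
}
```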

          Thoughts?

          Owen O'Malley added a comment -

          To clarify, in this jira you intend to:

          1. Use a job specific random key, which is included in the URL of the fetch.
          2. Allow jobs to request encryption of the map output using a second job specific random key. I assume the configuration boolean would be something like mapred.job.shuffle.encrypt. If the outputs are encrypted, I assume that we checksum the unencrypted data and include the checksum in the encryption.

          Once you have done that, there isn't any motivation to pay for https.

          Devaraj Das added a comment -

          1. Use a job specific random key, which is included in the URL of the fetch.

          Yes.

          2. Allow jobs to request encryption of the map output using a second job specific random key. I assume the configuration boolean would be something like mapred.job.shuffle.encrypt.

          Yes.

          If the outputs are encrypted, I assume that we checksum the unencrypted data and include the checksum in the encryption.

I am not sure whether this is required. The encrypted bytes would be checksummed automatically as we write them to disk. Do we need to build the extra logic of checksumming the unencrypted bytes? That might be a big deal when we have multiple map output spills that we finally merge at the end and spill to disk. I propose we just live with the (auto) checksum of the encrypted bytes.

          Devaraj Das added a comment -

BTW we also think that TTs should authenticate themselves to the reduce tasks (to protect the reduces against malicious TTs that might serve up wrong map outputs). One way to do that is to have the TT send their passwords in the response to the map output request.

          Kan Zhang added a comment -

          > The one way to do that is to have the TT send their passwords in the response to the map output request.
How is the TT password generated? The same way as the reduce task password? They can't be the same password, since otherwise the TT could simply read the password from the reducer request and send it back as the response. HTTP Digest authentication makes it possible to use the same password for mutual authentication.

          Devaraj Das added a comment -

          Yes the thought is to have a different key that the client generates during job submission.

          Devaraj Das added a comment -

          Summarizing:
1) The JobTracker generates the job token and persists it to HDFS in the jobId directory
2) The TaskTracker, as part of localization, reads the token file and localizes it in the secure location on the local disk
3) The ReduceTask reads that file, computes an HMAC-SHA1 of the URL using the token as the key, and sends it to the TT as part of the map output request
4) The TT hosting the map output reads the same key and validates the HMAC. If the validation is successful, the TT computes an HMAC-SHA1 of the HMAC-SHA1 it just received and sends it as an HTTP header in the map output response.
5) The reduce task in turn validates that. If the validation is successful, it accepts the map output bytes.
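Steps 3-5 of this handshake can be sketched with javax.crypto. Class and variable names are illustrative, and values such as the job token and URL are hypothetical; the real implementation lives in the attached patches:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;

public class ShuffleAuth {
    // Compute HMAC-SHA1 of a message under the shared job token.
    public static byte[] hmac(byte[] key, byte[] message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(key, "HmacSHA1"));
            return mac.doFinal(message);
        } catch (GeneralSecurityException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        byte[] jobToken = "example-job-token".getBytes(StandardCharsets.UTF_8);
        String url = "/mapOutput?job=job_1&map=m_0&reduce=0";

        // Step 3: the reduce task signs the request URL.
        byte[] requestHash = hmac(jobToken, url.getBytes(StandardCharsets.UTF_8));

        // Step 4: the TT recomputes and verifies the request hash,
        // then signs the received hash itself for the response header.
        byte[] expected = hmac(jobToken, url.getBytes(StandardCharsets.UTF_8));
        if (!MessageDigest.isEqual(expected, requestHash)) {
            throw new SecurityException("reducer failed to authenticate");
        }
        byte[] responseHash = hmac(jobToken, requestHash);

        // Step 5: the reduce task validates the TT's response header.
        boolean ttAuthentic = MessageDigest.isEqual(
            hmac(jobToken, requestHash), responseHash);
        System.out.println("TT authenticated: " + ttAuthentic);
    }
}
```

Note the use of `MessageDigest.isEqual` rather than `Arrays.equals` for the comparisons, which avoids timing side channels.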

          Devaraj Das added a comment -

Actually, it probably makes sense to write the job token file during job initialization. The alternative is to do it in the submitJob RPC method, but that would mean the RPC handler is blocked during the HDFS access.

          Boris Shkolnik added a comment -

          first draft

          Devaraj Das added a comment -

          Looked at the patch in brief. Some first level comments:
          1) Remove the method setJobTokenFile from JobConf. This is really a TT-Task configuration.
2) It probably makes sense to have the task read the configuration from the localized file directly. Since the token will be used (later on, in a separate jira) to bootstrap even the task<->TT mutual authentication, it is better to check permissions on the localized file before trusting the key. The other option is to have the task read it from HDFS.
3) What happens if the shuffle fails due to authentication problems? Maybe that needs to be handled specially w.r.t. things like fetch failure notifications, and the reduce task killing itself after some number of trials.
          4) The JobTracker should create the job-token file during running initTasks for the job in question.

          Kan Zhang added a comment -

          @Devaraj
          > Since the token will be used (later on in a separate jira) to bootstrap even the task<->TT mutual authentication
Are you talking about Task<->TT heartbeats over RPC? For this connection, I suggest we use a separate key (in the format of a Delegation token) that is generated by the TT and given to the Task just before it is launched. This way the key is known only to the local task, which helps prevent Tasks running on other machines from accidentally connecting to this TT. In terms of implementation, the TT can do this the same way the NN does, e.g., instantiate a DelegationTokenHandler for generating Delegation tokens and couple it with RPC (no need to persist the MasterKey, though).

          Kan Zhang added a comment -

          > This way the key is known only to the local task
          Also, no need to persist this key as part of the job. This key is just a runtime artifact of the Task and TT.

          Devaraj Das added a comment -

Kan, the RPC port on the TaskTracker is supposed to be bound to localhost only, so others outside the node in question shouldn't be able to do RPC.
But let's keep that discussion to a separate jira.

          Devaraj Das added a comment -

          Looked at the patch some more. Few more comments:
          1) The tasktracker needs to maintain a mapping from JobIDs to job-tokens
2) The call to localizeJobTokenFile should be done before the call to taskController.initializeJob(context) in the TaskTracker.localizeJob method. Could localizeJobTokenFile be called within TaskTracker.localizeJobFiles?
          3) Minor: for the request/response HTTP headers, make the first character upper case
4) HMacUtil could override the equals method and put in logic for comparing two HMacUtil objects, instead of defining verifyHash.
          5) The Comp class in StoreKeys.java seems to be unused. StoreKeys could be Writable (as opposed to having to define load/store methods)

For the case where a reduce task fails due to the TaskTracker(s) not being authentic, we probably need to take care. Two things might happen: the JobTracker might get enough notifications from other reduces in the system and just decide to re-execute the map. The other situation is what is bothering me: the reduce task would kill itself after a certain threshold number of trials. This would be bad. IIRC it is not predictable which one would happen first.

          Devaraj Das added a comment -

My worry about the reduce task killing itself can be ignored. That is the right thing to happen, as Boris and I discussed offline.

          Boris Shkolnik added a comment -

          1) The tasktracker needs to maintain a mapping from JobIDs to job-tokens

          done

          2) The call to localizeJobTokenFile should be done before the call to taskController.initializeJob(context) in the TaskTracker.localizeJob method. Could the localizeJobTokenFile be called within TaskTracker.localizeJobFiles

          3) Minor: for the request/response HTTP headers, make the first character upper case

          done

4) HMacUtil could override the equals method and put in logic for comparing two HMacUtil objects, instead of defining verifyHash.

We are not really comparing HMacUtil objects; they are just utilities. So I think verifyHash() is more logical.

          5) The Comp class in StoreKeys.java seems to be unused. StoreKeys could be Writable (as opposed to having to define load/store methods)

          Comp is used in the TreeMap constructor as the comparator.

          Also added synchronization around the map of StoreKeys updates in TaskTracker.

          Boris Shkolnik added a comment -

          added test

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12424422/MAPREDUCE-1026-2.patch
          against trunk revision 834284.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The patch appears to cause tar ant target to fail.

          -1 findbugs. The patch appears to cause Findbugs to fail.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/233/testReport/
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/233/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/233/console

          This message is automatically generated.

          Boris Shkolnik added a comment -

          fixed warnings

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12424539/MAPREDUCE-1026-3.patch
          against trunk revision 834284.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          -1 release audit. The applied patch generated 161 release audit warnings (more than the trunk's current 159 warnings).

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/235/testReport/
          Release audit warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/235/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/235/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/235/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/235/console

          This message is automatically generated.

          Boris Shkolnik added a comment -

          Review notes implemented.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12425287/MAPREDUCE-1026-9.patch
          against trunk revision 881536.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/142/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/142/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/142/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/142/console

          This message is automatically generated.

          Boris Shkolnik added a comment -

          ivy.xml update for contribs

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12425383/MAPREDUCE-1026-12.patch
          against trunk revision 881673.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/250/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/250/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/250/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/250/console

          This message is automatically generated.

          Boris Shkolnik added a comment -

          Added the port number to the hashed URL.
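Folding the port into the hashed URL ties the digest to a specific TaskTracker endpoint, not just a path. The sketch below illustrates the idea with an HMAC-SHA1 over the full URL string; `ShuffleUrlHash` and `hashUrl` are illustrative names and key handling is simplified — this is not the actual SecureShuffleUtils code.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class ShuffleUrlHash {

    // Hypothetical sketch: HMAC-SHA1 of the request URL using a
    // job-specific secret, hex-encoded for transport in the request.
    public static String hashUrl(String url, byte[] jobSecret) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(jobSecret, "HmacSHA1"));
        byte[] digest = mac.doFinal(url.getBytes("UTF-8"));
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] secret = "job-secret".getBytes("UTF-8");
        // Same path, different ports: because the port is part of the
        // hashed string, the two digests differ.
        String h1 = hashUrl("http://tt1:50060/mapOutput?job=j1&map=m1", secret);
        String h2 = hashUrl("http://tt1:50061/mapOutput?job=j1&map=m1", secret);
        System.out.println(h1.equals(h2)); // prints false
    }
}
```

Since the digest differs per port, a hash captured from one tracker cannot be replayed against the same path on another.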

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12425415/MAPREDUCE-1026-13.patch
          against trunk revision 881673.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 1 new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/251/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/251/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/251/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/251/console

          This message is automatically generated.

          Boris Shkolnik added a comment -

          Addressed a minor Findbugs nit.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12425504/MAPREDUCE-1026-14.patch
          against trunk revision 881673.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/146/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/146/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/146/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/146/console

          This message is automatically generated.

          Boris Shkolnik added a comment -

          Moved SecureShuffleUtils and JobTokens into the o.a.h.mapreduce.security package.

          Devaraj Das added a comment -

          I just committed this. Thanks, Boris!

          Konstantin Boudnik added a comment -

          Technically, a JIRA has to be reviewed before the commit can happen.

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #126 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/126/)
          MAPREDUCE-1026. Does mutual authentication of the shuffle transfers using a shared JobTracker-generated key. Contributed by Boris Shkolnik.

          Devaraj Das added a comment -

          I missed some LOG.debug statements that create string objects unnecessarily. We should make those LOG calls conditional on 'if (LOG.isDebugEnabled())' in a separate JIRA.
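The guard pattern described above can be sketched with the JDK logger standing in for commons-logging — here `LOG.isLoggable(Level.FINE)` plays the role of `LOG.isDebugEnabled()`, and the class and method names are illustrative, not Hadoop code:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class GuardedLogging {

    static final Logger LOG = Logger.getLogger(GuardedLogging.class.getName());

    // Counts how often the expensive message is actually constructed.
    static int buildCalls = 0;

    // Stand-in for a costly debug string (concatenation, toString calls).
    static String expensiveMessage(Object detail) {
        buildCalls++;
        return "shuffle state: " + detail;
    }

    // Guarded logging: the message is only built when the debug level
    // is actually enabled, so disabled logging costs one level check.
    static void logDebug(Object detail) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(expensiveMessage(detail));
        }
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO);       // debug-level (FINE) disabled
        logDebug(new int[1024]);
        System.out.println(buildCalls); // prints 0: message never built
    }
}
```

Without the guard, the string concatenation (and any toString on large objects) runs on every call, even when the log line is ultimately discarded.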

          Boris Shkolnik added a comment -

          Created MAPREDUCE-1236 for the LOG.isDebugEnabled() issue.

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #162 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/162/)

          Aaron Kimball added a comment -

          I am finding a NullPointerException in Shuffle when I run things with the LocalJobRunner:

          09/12/10 16:08:58 WARN mapred.LocalJobRunner: job_local_0001
          java.lang.NullPointerException
          	at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:108)
          	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:358)
          	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:299)
          

          reduceTask.getJobTokens() is returning null; I can't see anywhere in LocalJobRunner where the JobTokens object is initialized. I think this patch is to blame?
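One way to surface that failure earlier is a fail-fast guard at the point the tokens are first used, so the job dies with a descriptive message rather than an NPE deep inside Shuffle.run(). This is a hypothetical sketch — `JobTokens` here is a stand-in class and `requireJobTokens` is an invented helper, not the actual Shuffle or LocalJobRunner code:

```java
public class TokenGuard {

    // Stand-in for the real JobTokens class.
    static class JobTokens { }

    // Mirrors the reported situation: never initialized by the runner.
    static JobTokens jobTokens = null;

    // Fail fast with a descriptive message instead of an NPE later.
    static JobTokens requireJobTokens() {
        if (jobTokens == null) {
            throw new IllegalStateException(
                "JobTokens not initialized; was the job submitted through "
                + "a runner that skips token setup (e.g. LocalJobRunner)?");
        }
        return jobTokens;
    }

    public static void main(String[] args) {
        try {
            requireJobTokens();
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

A guard like this points the reader at the missing initialization instead of a bare NullPointerException at a line number.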

          Devaraj Das added a comment -

          I don't think so. In the local mode, shuffle shouldn't be invoked at all...

          Jitendra Nath Pandey added a comment -

          Patch for Hadoop-20 added.

          Devaraj Das added a comment -

          This fixes a bug in the original Y20 backport. Not for commit here.


            People

            • Assignee:
              Boris Shkolnik
            • Reporter:
              Owen O'Malley
            • Votes:
              1
            • Watchers:
              12
