Hadoop Map/Reduce
MAPREDUCE-6166

Reducers do not validate checksum of map outputs when fetching directly to disk

      Description

      In very large map/reduce jobs (50,000 maps, 2,500 reducers), the intermediate map partition output can become corrupted on disk on the map side. If a corrupted map output is too large to shuffle in memory, the reducer streams it to disk without validating the checksum. In jobs this large, it can take hours before the reducer finally tries to read the corrupted file and fails, and since retries of the failed reduce attempt will also take hours, the delay in discovering the failure is multiplied greatly.
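
      For context, a minimal sketch of the pre-patch copy loop being described, reconstructed from the code snippets quoted later in this issue (variable names follow those snippets; this is an illustration, not the exact source):

        // Pre-patch OnDiskMapOutput#shuffle, abbreviated: the raw socket
        // stream is copied straight to local disk with plain read(), so
        // nothing verifies the trailing IFile checksum. Corruption is only
        // detected much later, when the merge phase re-reads this file.
        final int BYTES_TO_READ = 64 * 1024;
        byte[] buf = new byte[BYTES_TO_READ];
        long bytesLeft = compressedLength;
        while (bytesLeft > 0) {
          int n = input.read(buf, 0, (int) Math.min(bytesLeft, BYTES_TO_READ));
          disk.write(buf, 0, n);  // plain copy; no checksum validation
          bytesLeft -= n;
        }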

      1. MAPREDUCE-6166.v1.201411221941.txt
        6 kB
        Eric Payne
      2. MAPREDUCE-6166.v2.201411251627.txt
        6 kB
        Eric Payne
      3. MAPREDUCE-6166.v3.txt
        6 kB
        Eric Payne
      4. MAPREDUCE-6166.v4.txt
        7 kB
        Eric Payne
      5. MAPREDUCE-6166.v5.txt
        7 kB
        Eric Payne

        Activity

        eepayne Eric Payne added a comment -

        This patch changes OnDiskMapOutput#shuffle to wrap the input stream in an IFileInputStream and read it via IFileInputStream#readWithChecksum. The result is that when a corrupted map output file is shuffled directly to disk on the reduce side, the corruption is caught immediately and the map is retried (due to "Too many shuffle failures").

        This behavior is similar to what happens when a corrupted file is shuffled to memory as opposed to disk.
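
        In code terms, the change amounts to something like the following sketch inside OnDiskMapOutput#shuffle (reconstructed from snippets quoted later in this thread; the surrounding method is elided):

        input = new IFileInputStream(input, compressedLength, conf);
        long bytesLeft = compressedLength;
        final int BYTES_TO_READ = 64 * 1024;
        byte[] buf = new byte[BYTES_TO_READ];
        while (bytesLeft > 0) {
          // readWithChecksum verifies the trailing IFile checksum and also
          // copies the checksum bytes into buf, so the file written to disk
          // keeps the full IFile format for the later merge pass.
          int n = ((IFileInputStream)input).readWithChecksum(
              buf, 0, (int) Math.min(bytesLeft, BYTES_TO_READ));
          // ... write buf to disk and update bytesLeft, as before ...
        }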

        eepayne Eric Payne added a comment -

        Jason Lowe, would you like to take a look at this patch?
        Thanks.

        hadoopqa Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12683171/MAPREDUCE-6166.v1.201411221941.txt
        against trunk revision a4df9ee.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5044//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5044//console

        This message is automatically generated.

        jira.shegalov Gera Shegalov added a comment -

        Hi Eric Payne, thanks for reporting the issue. If you look at InMemoryMapOutput#shuffle, the first thing it does is overwrite the passed InputStream with the IFileInputStream-wrapped version of it. So if we simply move this logic from there to the caller of mapOutput.shuffle, i.e., Fetcher#setupShuffleConnection, this common behavior is automatically picked up by both InMemory and OnDisk and we don't have to modify the latter.
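
        A rough sketch of the suggested move; the call-site shape and variable names here are assumptions, not code from any patch on this issue:

        // In the Fetcher, before dispatching to whichever MapOutput was
        // reserved: wrap once, so InMemoryMapOutput and OnDiskMapOutput both
        // receive a checksum-validating stream and neither needs its own
        // wrapping logic.
        input = new IFileInputStream(input, compressedLength, jobConf);
        mapOutput.shuffle(host, input, compressedLength, decompressedLength,
            metrics, reporter);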

        jlowe Jason Lowe added a comment -

        I'd be a little wary of doing this. I believe the MergeManager and MapOutput classes are being used by third-party software like SyncSort; see MAPREDUCE-4808, MAPREDUCE-4039, and related JIRAs. Changing the input stream passed to mapOutput.shuffle to an IFileInputStream and then calling read() on the data subtly changes the behavior. Before, when it was not an IFileInputStream, calling read() would read all the data and the checksum; after it's wrapped at a higher level, it won't. If the third-party software is itself wrapping the stream with IFileInputStream to handle the trailing checksum, then after this change the stream would be double-wrapped and checksum verification would fail.
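
        To illustrate the hazard (a constructed example, not code from any patch here):

        // Suppose the framework now wraps the stream before calling shuffle():
        InputStream wrapped = new IFileInputStream(raw, compressedLength, conf);
        // Third-party MapOutput code, written against the old contract, wraps
        // again to handle the trailing checksum itself:
        IFileInputStream doubleWrapped =
            new IFileInputStream(wrapped, compressedLength, conf);
        // The inner wrapper already verifies and strips the trailing 4 bytes,
        // so the outer wrapper sees fewer bytes than compressedLength promises
        // and its own checksum verification fails.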

        jira.shegalov Gera Shegalov added a comment -

        Hi Jason Lowe, thanks for pointing out the 3rd-party use cases; I completely forgot about that. So how about we make it explicit that InMemoryMapOutput and OnDiskMapOutput are different from 3rd-party implementations (so I don't forget it next time) by having them subclass a common class. We can put the common IFileInputStream wrapping logic there, and maybe even move private final MergeManagerImpl<K, V> merger; there as well.
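
        A hedged sketch of the idea (the class name and constructor shape here are hypothetical):

        // Hypothetical common parent owning the merger reference and the
        // one-time IFileInputStream wrapping, leaving MapOutput itself
        // untouched for 3rd-party subclasses.
        public abstract class IFileWrappedMapOutput<K, V> extends MapOutput<K, V> {
          protected final MergeManagerImpl<K, V> merger;

          protected IFileWrappedMapOutput(MergeManagerImpl<K, V> merger,
              TaskAttemptID mapId, long size, boolean primaryMapOutput) {
            super(mapId, size, primaryMapOutput);
            this.merger = merger;
          }
        }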

        jlowe Jason Lowe added a comment -

        I'm not sure we need all the boilerplate of an extra class to save one line of code (two if we count the MergeManager member), and I'm not sure that an extra class alone will make it clear that MapOutput can be used externally. IMHO, if we want to do this kind of refactoring, it can be done in another JIRA.

        jira.shegalov Gera Shegalov added a comment -

        This patch adds one more common instance field, the configuration: JobConf jobConf. It should be final, by the way.

        eepayne Eric Payne added a comment -

        Thank you very much Gera Shegalov and Jason Lowe for your comments and help in making this patch better.

        If it's okay, I would like to:

        1. Update this patch with the final keyword on JobConf jobConf.
        2. Create a separate JIRA for refactoring the code to inherit from a common class.

        Uploading a patch to cover #1.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12683592/MAPREDUCE-6166.v2.201411251627.txt
        against trunk revision 61a2510.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        -1 eclipse:eclipse. The patch failed to build with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5050//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5050//console

        This message is automatically generated.

        jira.shegalov Gera Shegalov added a comment -

        Sounds good, Eric Payne.
        I have a few comments then. Some are in light of a follow-up JIRA.

        update this patch with the final keyword on JobConf jobConf

        Let us make the instance variable type the more general Configuration, as we are not doing anything specific to JobConf.

        Instead of introducing a new local variable iFin in OnDiskMapOutput#shuffle, we can overwrite input as in InMemoryMapOutput#shuffle.
        We can either capture the shuffle size in an instance variable, as InMemoryMapOutput does implicitly via memory.length, or we can set

        long bytesLeft = compressedLength - ((IFileInputStream)input).getSize();
        

        Then we don't need to touch the input.read line to make it readWithChecksum.

        Good call adding finally with close.

        I also have some comments for the test:
        ios.finish() should be removed because it's redundant: IFileOutputStream#close() will call it as well.

        We don't need the PrintStream wrapping, and we need to be careful not to leak file descriptors in case I/O fails.

           new PrintStream(fout).print(bout.toString());
           fout.close();
        

        Should be something like:

            try {
              fout.write(bout.toByteArray());
            } finally {
              fout.close();
            }
        

        Similarly, we need to make sure that fin.close() is in a try-finally block enclosing the header and shuffle reads.
        Let us not do

         catch(Exception e) {
             fail("OnDiskMapOutput.shuffle did not process the map partition file");
        

        It's redundant because the exception already fails the test.

        The same PrintStream and fout.close() remarks apply to the code creating the corrupted file.
        dataSize/2: I believe the Sun Java Coding Style requires spaces around arithmetic operators.

        In the fragment where we expect the checksum to fail, fin.close() should be in a finally block.
        catch(Exception e) is too broad. Let us be more specific and maybe even log it:

              } catch(ChecksumException e) {
                LOG.info("Expected checksum exception thrown.", e);
              }
        

        Thinking a bit more about file.out: it does not seem to be cleaned up after the test has finished. But we probably don't even need to create files; we can simply use new ByteArrayInputStream(bout.toByteArray()) and new ByteArrayInputStream(corrupted) as input.
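
        A short sketch of that file-free variant, assuming the test's existing bout and corrupted variables:

        // Feed the shuffle directly from memory instead of a temp file;
        // nothing is left on disk for tearDown to clean up.
        InputStream goodInput = new ByteArrayInputStream(bout.toByteArray());
        InputStream corruptedInput = new ByteArrayInputStream(corrupted);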

        eepayne Eric Payne added a comment -

        Gera Shegalov, thank you very much for your detailed analysis of this patch.

        I have opened MAPREDUCE-6174 to cover the parent class for InMemoryMapOutput and OnDiskMapOutput, and I will continue to work on the above-mentioned code points.

        eepayne Eric Payne added a comment -

        Gera Shegalov, I have one question about re-using the input.read code in OnDiskMapOutput.

        We can set

        long bytesLeft = compressedLength - ((IFileInputStream)input).getSize();
        

        Then we don't need to touch the input.read line to make it readWithChecksum.

        In this case, input.read does not write the checksum to the disk while readWithChecksum will write it.

        Since OnDiskMapOutput is shuffling the whole IFile to disk, the checksum is needed later during the last merge pass when the IFile contents are read again and decompressed.

        If we were to implement input.read as above, it looks like we would still need to add something like the following in order to put the checksum on the disk:

        disk.write(((IFileInputStream)input).getChecksum(), 0, (int) ((IFileInputStream)input).getSize());
        
        eepayne Eric Payne added a comment -

        Just to clarify: neither read nor readWithChecksum writes anything to disk. They both read data into the byte buffer, which is then written to disk by shuffle.

        jira.shegalov Gera Shegalov added a comment -

        Thanks for commenting, Eric Payne!

        Since OnDiskMapOutput is shuffling the whole IFile to disk, the checksum is needed later during the last merge pass when the IFile contents are read again and decompressed.

        Can you clarify where in the code it's required to keep the original checksum?

        What I see is that after your modifications, OnDiskMapOutput is guaranteed to validate the contents of the destination buffer against the remote checksum. Then these contents are written out using LocalFileSystem, which will again create an on-disk checksum because it's based on ChecksumFileSystem. Are you proposing an optimization where the checksum is not computed twice when shuffling straight to disk, by using RawLocalFileSystem? Can we defer that to another JIRA?

        eepayne Eric Payne added a comment -

        Gera Shegalov, I'm sorry for not being clear.

        Can you clarify where in the code it's required to keep the original checksum?
        Then these contents are written out using LocalFileSystem, which will again create an on-disk checksum because it's based on ChecksumFileSystem.

        I don't think the IFile format is related to ChecksumFileSystem.

        The IFile checksum is expected to be the last 4 bytes of the IFile, and if we use input.read as below, those 4 bytes of checksum are not copied into buf:

            input = new IFileInputStream(input, compressedLength, conf);
            // Copy data to local-disk
            long bytesLeft = compressedLength - ((IFileInputStream)input).getSize();
            try {
              final int BYTES_TO_READ = 64 * 1024;
              byte[] buf = new byte[BYTES_TO_READ];
              while (bytesLeft > 0) {
                int n = input.read(buf, 0, (int) Math.min(bytesLeft, BYTES_TO_READ));
        ...
        

        However, if we use readWithChecksum as below, the checksum is copied into buf:

            input = new IFileInputStream(input, compressedLength, conf);
            // Copy data to local-disk
            long bytesLeft = compressedLength;
            try {
              final int BYTES_TO_READ = 64 * 1024;
              byte[] buf = new byte[BYTES_TO_READ];
              while (bytesLeft > 0) {
                int n = ((IFileInputStream)input).read(buf, 0, (int) Math.min(bytesLeft, BYTES_TO_READ));
        ...
        

        Without those last 4 bytes of checksum at the end of the IFile format, the final read will fail during the last merge pass with a checksum error.
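
        For reference, the layout being described, as a comment sketch (simplified; the authoritative definition lives in IFile.java and IFileOutputStream):

        // IFile stream layout, simplified:
        //   [vint keyLen][vint valLen][key bytes][value bytes]   (repeated)
        //   [EOF marker: keyLen = -1, valLen = -1]
        //   [4-byte checksum over the above, appended by IFileOutputStream]
        //
        // IFileInputStream#read verifies the final 4 bytes but does not
        // return them; readWithChecksum verifies them AND copies them into
        // the caller's buffer, preserving the format for the merge pass.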

        eepayne Eric Payne added a comment -

        I'm sorry. The second code snippet should have been this:

            input = new IFileInputStream(input, compressedLength, conf);
            // Copy data to local-disk
            long bytesLeft = compressedLength;
            try {
              final int BYTES_TO_READ = 64 * 1024;
              byte[] buf = new byte[BYTES_TO_READ];
              while (bytesLeft > 0) {
                int n = ((IFileInputStream)input).readWithChecksum(buf, 0, (int) Math.min(bytesLeft, BYTES_TO_READ));
        ...
        
        eepayne Eric Payne added a comment -

        Gera Shegalov, did that make sense?

        jira.shegalov Gera Shegalov added a comment -

        Hi Eric Payne, sorry for the delay. I knew what modifications you were talking about, but I did not have the time to verify and convince myself whether this double checksumming is really needed in the Merger. I did, though, run a version of the patch implementing my suggestion above through a couple of unit tests and did not see any issues.

        mapreduce.task.reduce.TestFetcher
        mapreduce.task.reduce.TestMergeManager
        mapreduce.task.reduce.TestMerger
        

        That's why I hoped you'd point me to where the failure occurs. That's my current status on it. I hope to get back to it again soon.

        eepayne Eric Payne added a comment -

        Thanks, Gera Shegalov, for taking the time to investigate this issue.

        The unit tests are not catching this. I am testing this in a 10-node secure cluster.

        I am running wordcount on a file that (1) has no repeated words and (2) is large enough to ensure that at least some of the map outputs are shuffled to disk:

        $ $HADOOP_PREFIX/bin/hadoop fs -cat Input/NoRecurringWords/NoRecurringWords-part0.txt | wc -l -w
         4008920 4008920
        $ $HADOOP_PREFIX/bin/hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-$HADOOP_VERSION.jar wordcount Input/NoRecurringWords/NoRecurringWords-part0.txt Output/01
        

        If I implement the fix by adjusting bytesLeft and leaving input.read() alone, OnDiskMapOutput#shuffle does succeed and writes the map output to disk in a temporary location. However, when the Merger goes to read that temporary file (via RawKVIteratorReader in MergeManager), it fails with the following exception:

        2014-12-04 17:47:12,040 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.fs.ChecksumException: Checksum Error
        	at org.apache.hadoop.mapred.IFileInputStream.doRead(IFileInputStream.java:228)
        	at org.apache.hadoop.mapred.IFileInputStream.read(IFileInputStream.java:152)
        	at org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:127)
        	at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:98)
        	at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
        	at org.apache.hadoop.io.IOUtils.wrappedReadForCompressedData(IOUtils.java:170)
        	at org.apache.hadoop.mapred.IFile$Reader.readData(IFile.java:378)
        	at org.apache.hadoop.mapred.IFile$Reader.nextRawKey(IFile.java:426)
        	at org.apache.hadoop.mapred.Merger$Segment.nextRawKey(Merger.java:337)
        	at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:519)
        	at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:547)
        	at org.apache.hadoop.mapred.ReduceTask$4.next(ReduceTask.java:601)
        ...
        

        This is because the checksum is not at the end of the temporary file.
        On the other hand, if I leave bytesLeft alone and instead call ((IFileInputStream)input).readWithChecksum, the reducers all succeed. This is because readWithChecksum not only compares the input against the checksum, it also includes the checksum at the end of the byte buffer.

        Please let me know if that makes sense.

        jira.shegalov Gera Shegalov added a comment -

        Eric Payne, thanks for your reproducer. Would you mind uploading your patch to run it through Jenkins? If there is no test catching this, we should add one. It is now clearer why we need the checksum via IFile. I also now feel more strongly that we need to get rid of the checksumming LocalFileSystem, just like in MapTask.

        eepayne Eric Payne added a comment -

        Gera Shegalov, your comments outlined in https://issues.apache.org/jira/browse/MAPREDUCE-6166?focusedCommentId=14225864&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14225864 have been included in the attached patch (MAPREDUCE-6166.v3.txt). This patch uses readWithChecksum.
        hadoopqa Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12685168/MAPREDUCE-6166.v3.txt
        against trunk revision 0653918.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5060//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5060//console

        This message is automatically generated.

        jira.shegalov Gera Shegalov added a comment -

        Thanks for the updated patch, Eric Payne! I'll post my comments soon.

        I'm uploading a modified patch based on my previous review, with the sole intention of seeing what tests, if any, would catch a missing checksum in on-disk shuffle.

        hadoopqa Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12685532/MAPREDUCE-6166-gera-missing-cs-test.patch
        against trunk revision e227fb8.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5061//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5061//console

        This message is automatically generated.

        eepayne Eric Payne added a comment -

        Thank you very much, Gera Shegalov.

        I'm uploading a modified patch based on my previous review, with the sole intention of seeing what tests, if any, would catch a missing checksum in on-disk shuffle.

        The last segment of the test I added (TestFetcher#testCorruptedIFile) will catch a missing or incorrect checksum when it tries to read the IFile that was shuffled to disk by OnDiskMapOutput#shuffle.

        jira.shegalov Gera Shegalov added a comment -

        Thanks, it makes sense, Eric Payne. I just wanted to confirm that there are no existing tests catching a missing checksum in on-disk shuffle (confirmed!).

        For the production code I have only one comment:
        I am now convinced FileSystem.getLocal(conf) should be FileSystem.getLocal(conf).getRaw() in the corresponding OnDiskMapOutput constructor.
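
        As a sketch, that constructor change would look roughly like this (the surrounding constructor is elided and the field name is an assumption):

        // Use the raw local filesystem: the shuffled data already carries
        // its own IFile checksum, so ChecksumFileSystem's extra .crc
        // bookkeeping is redundant on this path.
        this.fs = FileSystem.getLocal(conf).getRaw();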

        TestFetcher#testCorruptedFiles comments:

        Nit: lower-case FETCHER, because it's a final variable, not a constant.

        FileSystem fs needs to be changed to FileSystem.getLocal(conf).getRaw(), just like in the production code, and it could be made an instance variable so we can use it for cleanup in tearDown.

        Path p should be more mnemonic, and it's better to use a root directory that matches the test method so we can use it for cleanup in tearDown:

        Path outputPath = new Path(name.getMethodName() + "/foo");
        

        Instead of reverse-engineering the path shuffledToDisk, we can use

        Path shuffledToDisk = OnDiskMapOutput.getTempPath(outputPath, fetcher);

        457	    ios.write(mapData.getBytes());
        458	    ios.close();
        

        ios.close() should be in a finally block.

        476	    bin = new ByteArrayInputStream(corrupted);
        477	    // Read past the shuffle header.
        478	    bin.read(new byte[headerSize], 0, headerSize);
        

        Move lines 477-478 inside the following try on line 480.

        Drop fs.deleteOnExit. It comes too late if there was an exception earlier; we should do the cleanup inside the tearDown method instead.
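
        A possible shape for that cleanup, as a sketch (assumes the JUnit TestName rule name and the raw-local fs instance variable mentioned above):

        @After
        public void tearDown() throws Exception {
          // Remove the per-test-method directory the test wrote under.
          fs.delete(new Path(name.getMethodName()), true);
        }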

        491	    IFileInputStream iFin = new IFileInputStream(
        492	        new FileInputStream(shuffledToDisk.toString()), dataSize, job);
        

        It's probably better not to mix in the java.io API when we already use the Hadoop FileSystem API. Why not do:

        491	    IFileInputStream iFin = new IFileInputStream(fs.open(shuffledToDisk), dataSize, job); 
        
        493	    iFin.read(new byte[dataSize], 0, dataSize);
        494	    iFin.close();
        495	    fs.close();
        

        Make sure to put lines 493 and 494 in a try/finally, accordingly.
        Since we are getting rid of fs.deleteOnExit, we don't need fs.close.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12685532/MAPREDUCE-6166-gera-missing-cs-test.patch
        against trunk revision 1b3bb9e.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 72 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5062//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5062//artifact/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5062//console

        This message is automatically generated.

        eepayne Eric Payne added a comment -

        Thank you again, Gera Shegalov, for the very detailed analysis.

        OnDiskMapOutput: FileSystem.getLocal(conf) should be FileSystem.getLocal(conf).getRaw()

        Code changed.


        TestFetcher#testCorruptedFiles comments:

        1. Lower-case FETCHER.
        2. Change FileSystem fs to FileSystem.getLocal(conf).getRaw().
        3. Make FileSystem fs an instance variable so it can be used for cleanup in tearDown.
        4. Rename Path p to be more descriptive.
        5. Use name.getMethodName() to create a directory for the onDiskMapOut file so it can be removed in tearDown.
        6. Use OnDiskMapOutput.getTempPath() when assigning a value to shuffledToDisk.
        7. Put ios.close() in a finally block.
        8. When creating and reading the corrupted ByteArrayInputStream, put it into a try block.
        9. Drop fs.deleteOnExit().
        10. When creating the IFileInputStream, don't mix in java.io FileInputStream.
        11. Put the close of the IFileInputStream into a finally block.
        12. Remove fs.close.

        The above changes are all done as part of this patch.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org
        against trunk revision 2e98ad3.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5071//console

        This message is automatically generated.

        jira.shegalov Gera Shegalov added a comment -

        Eric Payne, thanks for updating the patch. It might have gone stale; please check and rebase it.

        The patch does not appear to apply with p0 to p2
        
        eepayne Eric Payne added a comment -

        Thanks, Gera Shegalov. I went ahead and upmerged the patch and am attaching it here, but I'm pretty sure staleness didn't have anything to do with the "patch application" failures. In the console log (https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5071//console), I see a lot of "No such file or directory" errors, such as the following:

        /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/../patchprocess/jira: No such file or directory
        /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/../patchprocess/patch: No such file or directory
        cp: cannot stat '/home/jenkins/buildSupport/lib/*': No such file or directory
        

        etc...
        Hopefully the build environment is feeling better by now.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12686618/MAPREDUCE-6166.v5.txt
        against trunk revision 8e9a266.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 13 new Findbugs (version 2.0.3) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5074//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5074//artifact/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5074//console

        This message is automatically generated.

        jira.shegalov Gera Shegalov added a comment -

        Thanks for updating the patch, Eric Payne! +1 for v5. I'll commit on Monday.

        jira.shegalov Gera Shegalov added a comment -

        Forgot to add that the Findbugs report is not related to this patch.

        jira.shegalov Gera Shegalov added a comment -

        Committed to trunk and branch-2. Thanks, Eric Payne, for the contribution, and Jason Lowe for the additional review!

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #6726 (See https://builds.apache.org/job/Hadoop-trunk-Commit/6726/)
        MAPREDUCE-6166. Reducers do not validate checksum of map outputs when fetching directly to disk. (Eric Payne via gera) (gera: rev af006937e8ba82f98f468dc7375fe89c2e0a7912)

        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java
        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
        • hadoop-mapreduce-project/CHANGES.txt
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #43 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/43/)
        MAPREDUCE-6166. Reducers do not validate checksum of map outputs when fetching directly to disk. (Eric Payne via gera) (gera: rev af006937e8ba82f98f468dc7375fe89c2e0a7912)

        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java
        • hadoop-mapreduce-project/CHANGES.txt
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk #777 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/777/)
        MAPREDUCE-6166. Reducers do not validate checksum of map outputs when fetching directly to disk. (Eric Payne via gera) (gera: rev af006937e8ba82f98f468dc7375fe89c2e0a7912)

        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
        • hadoop-mapreduce-project/CHANGES.txt
        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk #1994 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1994/)
        MAPREDUCE-6166. Reducers do not validate checksum of map outputs when fetching directly to disk. (Eric Payne via gera) (gera: rev af006937e8ba82f98f468dc7375fe89c2e0a7912)

        • hadoop-mapreduce-project/CHANGES.txt
        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java
        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk #1975 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1975/)
        MAPREDUCE-6166. Reducers do not validate checksum of map outputs when fetching directly to disk. (Eric Payne via gera) (gera: rev af006937e8ba82f98f468dc7375fe89c2e0a7912)

        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
        • hadoop-mapreduce-project/CHANGES.txt
        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #40 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/40/)
        MAPREDUCE-6166. Reducers do not validate checksum of map outputs when fetching directly to disk. (Eric Payne via gera) (gera: rev af006937e8ba82f98f468dc7375fe89c2e0a7912)

        • hadoop-mapreduce-project/CHANGES.txt
        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java
        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #44 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/44/)
        MAPREDUCE-6166. Reducers do not validate checksum of map outputs when fetching directly to disk. (Eric Payne via gera) (gera: rev af006937e8ba82f98f468dc7375fe89c2e0a7912)

        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java
        • hadoop-mapreduce-project/CHANGES.txt
        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
        eepayne Eric Payne added a comment -

Thank you very much, Gera Shegalov.

        l201514 Siqi Li added a comment -

The latest patch can be applied cleanly to the 2.6.0 branch.

        vinodkv Vinod Kumar Vavilapalli added a comment -

        Pulled this into 2.6.1. Ran compilation and TestFetcher before the push. Patch applied cleanly.
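
A regression test for this kind of fix typically fabricates a corrupted payload and asserts that the checksum recomputed during the fetch no longer matches. The JUnit 4 sketch below shows that pattern using java.util.zip.CheckedInputStream; the CorruptFetchTest class and its contents are hypothetical illustrations, not the actual case added to TestFetcher.java.

{code:java}
import static org.junit.Assert.assertNotEquals;

import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

import org.junit.Test;

/**
 * Hypothetical sketch of a regression test: a corrupted payload must
 * fail checksum validation during the fetch itself, not later when
 * the reducer consumes the data.
 */
public class CorruptFetchTest {

  @Test
  public void corruptPayloadIsRejectedAtFetchTime() throws IOException {
    byte[] payload = new byte[1024];
    for (int i = 0; i < payload.length; i++) {
      payload[i] = (byte) i;
    }
    CRC32 crc = new CRC32();
    crc.update(payload);
    long expected = crc.getValue();  // checksum the sender would have written

    payload[100] ^= 0x40;  // simulate corruption of the stored map output

    // Recompute the checksum while "fetching" the bytes.
    CheckedInputStream in =
        new CheckedInputStream(new ByteArrayInputStream(payload), new CRC32());
    new DataInputStream(in).readFully(new byte[payload.length]);

    assertNotEquals("corruption must be detected during the fetch",
        expected, in.getChecksum().getValue());
  }
}
{code}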


          People

          • Assignee:
            eepayne Eric Payne
          • Reporter:
            eepayne Eric Payne
          • Votes:
            0
          • Watchers:
            16
