Hadoop Common
  1. Hadoop Common
  2. HADOOP-7183

WritableComparator.get should not cache comparator objects

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.21.1, 0.22.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      HADOOP-6881 modified WritableComparator.get such that the constructed WritableComparator gets saved back into the static map. This is fine for stateless comparators, but some comparators have per-instance state, and thus this becomes thread-unsafe and causes errors in the shuffle where multiple threads are doing comparisons. An example of a Comparator with per-instance state is WritableComparator itself.

      1. HADOOP-7183.patch
        2 kB
        Tom White
      2. HADOOP-7183.patch
        0.9 kB
        Tom White

        Issue Links

          Activity

          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk #680 (See https://builds.apache.org/hudson/job/Hadoop-Common-trunk/680/)
          HADOOP-7183. WritableComparator.get should not cache comparator objects. Contributed by Tom White

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk #680 (See https://builds.apache.org/hudson/job/Hadoop-Common-trunk/680/ ) HADOOP-7183 . WritableComparator.get should not cache comparator objects. Contributed by Tom White
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-22-branch #44 (See https://builds.apache.org/hudson/job/Hadoop-Common-22-branch/44/)
          HADOOP-7183. svn merge -c 1100056 from trunk

          Show
          Hudson added a comment - Integrated in Hadoop-Common-22-branch #44 (See https://builds.apache.org/hudson/job/Hadoop-Common-22-branch/44/ ) HADOOP-7183 . svn merge -c 1100056 from trunk
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #575 (See https://builds.apache.org/hudson/job/Hadoop-Common-trunk-Commit/575/)
          HADOOP-7183. WritableComparator.get should not cache comparator objects. Contributed by Tom White

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #575 (See https://builds.apache.org/hudson/job/Hadoop-Common-trunk-Commit/575/ ) HADOOP-7183 . WritableComparator.get should not cache comparator objects. Contributed by Tom White
          Hide
          Eli Collins added a comment -

          I've committed this to trunk, branch 22, and branch 21. Thanks Tom!

          Show
          Eli Collins added a comment - I've committed this to trunk, branch 22, and branch 21. Thanks Tom!
          Hide
          Todd Lipcon added a comment -

          +1

          Show
          Todd Lipcon added a comment - +1
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12477363/HADOOP-7183.patch
          against trunk revision 1096522.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/376//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/376//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/376//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12477363/HADOOP-7183.patch against trunk revision 1096522. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 javadoc. The javadoc tool appears to have generated 1 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/376//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/376//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/376//console This message is automatically generated.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12477363/HADOOP-7183.patch
          against trunk revision 1096522.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/377//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/377//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/377//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12477363/HADOOP-7183.patch against trunk revision 1096522. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/377//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/377//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/377//console This message is automatically generated.
          Hide
          Tom White added a comment -

          Removed test.

          Show
          Tom White added a comment - Removed test.
          Hide
          Owen O'Malley added a comment -

          The test needs to be removed from the fix, since it is only testing the implementation detail of this workaround fix.

          Tests need to reflect the correct semantics of the code, not implementation details that will be removed when someone has time to do the right fix.

          Show
          Owen O'Malley added a comment - The test needs to be removed from the fix, since it is only testing the implementation detail of this workaround fix. Tests need to reflect the correct semantics of the code, not implementation details that will be removed when someone has time to do the right fix.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12473422/HADOOP-7183.patch
          against trunk revision 1094750.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/359//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/359//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/359//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12473422/HADOOP-7183.patch against trunk revision 1094750. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/359//testReport/ Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/359//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/359//console This message is automatically generated.
          Hide
          Tom White added a comment -

          Yes, I think this can be committed now, since it is sufficient for restoring the previous behaviour. I've opened HADOOP-7219 for adding support for ThreadLocals.

          Show
          Tom White added a comment - Yes, I think this can be committed now, since it is sufficient for restoring the previous behaviour. I've opened HADOOP-7219 for adding support for ThreadLocals.
          Hide
          Nigel Daley added a comment -

          Tom, any update on this for 0.22?

          Show
          Nigel Daley added a comment - Tom, any update on this for 0.22?
          Hide
          Owen O'Malley added a comment -

          6881 changed it to add comparators.put there.

          Ah, right. Being too efficient. grin

          Making WritableComparator thread-safe by using a thread-local buffer would be a good enhancement but isn't necessary for this fix.

          This fix is brittle precisely because the expectation is that comparators are stateless. It looks like the shuffle doesn't reuse comparators between threads, but it is much much easier to make them thread-safe than ensure that no one puts in such a use case.

          Show
          Owen O'Malley added a comment - 6881 changed it to add comparators.put there. Ah, right. Being too efficient. grin Making WritableComparator thread-safe by using a thread-local buffer would be a good enhancement but isn't necessary for this fix. This fix is brittle precisely because the expectation is that comparators are stateless. It looks like the shuffle doesn't reuse comparators between threads, but it is much much easier to make them thread-safe than ensure that no one puts in such a use case.
          Hide
          Tom White added a comment -

          Before HADOOP-6881 the only way to add a WritableComparator to the cache was by explicitly calling WritableComparator.define(). HADOOP-6881 changed WritableComparator.get() to add a generic WritableComparator to the comparators cache when there was none registered for the class. This patch simply restores the previous behaviour.

          Making WritableComparator thread-safe by using a thread-local buffer would be a good enhancement but isn't necessary for this fix.

          Show
          Tom White added a comment - Before HADOOP-6881 the only way to add a WritableComparator to the cache was by explicitly calling WritableComparator.define(). HADOOP-6881 changed WritableComparator.get() to add a generic WritableComparator to the comparators cache when there was none registered for the class. This patch simply restores the previous behaviour. Making WritableComparator thread-safe by using a thread-local buffer would be a good enhancement but isn't necessary for this fix.
          Hide
          Todd Lipcon added a comment -

          This behavior wasn't changed by HADOOP-6881

          It looks to me like it was. Prior to 6881, if nothing was in the cache, it did not "write back" to the cache in WritableComparator.get(). 6881 changed it to add comparators.put there.

          Have you actually verified that the shuffle's sorts actually do a get per a thread?

          Not in the latest shuffle code, but it looked that way to me previously.

          Using ThreadLocals seems like a reasonable alternate implementation. But we should still make the contract explicit that cached comparators must be threadsafe (yes, it's fairly obvious, but we seem to have missed it ourselves!)

          Show
          Todd Lipcon added a comment - This behavior wasn't changed by HADOOP-6881 It looks to me like it was. Prior to 6881, if nothing was in the cache, it did not "write back" to the cache in WritableComparator.get(). 6881 changed it to add comparators.put there. Have you actually verified that the shuffle's sorts actually do a get per a thread? Not in the latest shuffle code, but it looked that way to me previously. Using ThreadLocals seems like a reasonable alternate implementation. But we should still make the contract explicit that cached comparators must be threadsafe (yes, it's fairly obvious, but we seem to have missed it ourselves!)
          Hide
          Owen O'Malley added a comment -

          I'm not very happy with this fix. The implicit contract is that all comparators in the cache must be thread safe. To special case the WritableComparator case because it isn't thread safe seems like the wrong direction. It seems far less brittle to use a thread local buffer for the WritableComparator.

          Have you actually verified that the shuffle's sorts actually do a get per a thread?

          Show
          Owen O'Malley added a comment - I'm not very happy with this fix. The implicit contract is that all comparators in the cache must be thread safe. To special case the WritableComparator case because it isn't thread safe seems like the wrong direction. It seems far less brittle to use a thread local buffer for the WritableComparator. Have you actually verified that the shuffle's sorts actually do a get per a thread?
          Hide
          Owen O'Malley added a comment -

          This behavior wasn't changed by HADOOP-6881, so it has been there for a long time.

          Show
          Owen O'Malley added a comment - This behavior wasn't changed by HADOOP-6881 , so it has been there for a long time.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12473422/HADOOP-7183.patch
          against trunk revision 1080723.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/308//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/308//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/308//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12473422/HADOOP-7183.patch against trunk revision 1080723. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/308//testReport/ Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/308//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/308//console This message is automatically generated.
          Hide
          Doug Cutting added a comment -

          +1 patch looks good to me too.

          Show
          Doug Cutting added a comment - +1 patch looks good to me too.
          Hide
          Todd Lipcon added a comment -

          +1 pending test results

          Show
          Todd Lipcon added a comment - +1 pending test results
          Hide
          Tom White added a comment -

          The problem is that WritableComparator has a mutable field - DataInputBuffer buffer - which is only used by WritableComparable implementations that don't override the optimized binary compare method. IntWritable, Text, etc override this method, so there is no thread safety issue for these.

          The remedy is to only register comparators explicitly, i.e. not the "generic" ones, since they may not be thread-safe. This is actually the behaviour that was in place before HADOOP-6881.

          I've also updated the javadoc for WritableComparator.define to clarify that it should only be called for thread-safe classes.

          Show
          Tom White added a comment - The problem is that WritableComparator has a mutable field - DataInputBuffer buffer - which is only used by WritableComparable implementations that don't override the optimized binary compare method. IntWritable, Text, etc override this method, so there is no thread safety issue for these. The remedy is to only register comparators explicitly, i.e. not the "generic" ones, since they may not be thread-safe. This is actually the behaviour that was in place before HADOOP-6881 . I've also updated the javadoc for WritableComparator.define to clarify that it should only be called for thread-safe classes.
          Hide
          Doug Cutting added a comment -

          The fix should probably also be merged to the 0.20 and 0.21 branches.

          Show
          Doug Cutting added a comment - The fix should probably also be merged to the 0.20 and 0.21 branches.

            People

            • Assignee:
              Tom White
              Reporter:
              Todd Lipcon
            • Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development