Hadoop Common / HADOOP-5664

Use of ReentrantLock.lock() in MapOutputBuffer takes up too much cpu time

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.19.1
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      In examining a profile of one of my mappers today, I noticed that the method ReentrantLock.lock() in MapTask$MapOutputBuffer seems to be taking up ~11 seconds out of around 100 seconds total. It seems like 10% is an awfully large amount of time to spend in this lock.

      Attachments

      1. 5664-1.patch
        13 kB
        Chris Douglas
      2. 5664-0.patch
        12 kB
        Chris Douglas

        Activity

        Bryan Duxbury created issue -
        Bryan Duxbury made changes -
        Field               Original Value         New Value
        Affects Version/s                          0.19.1 [ 12313473 ]
        Affects Version/s   0.18.3 [ 12313494 ]
        Environment                                Cloudera's 0.18.3 distribution
        Chris Douglas added a comment -

        That does seem high, though I haven't seen comparable results for any mappers I've profiled. What is your map doing? Do your keys/values serialize using a series of small writes? Which JVM are you running?

        Bryan Duxbury added a comment -

        I'm running Java 1.6. My map is converting ~50-byte records into records of less than 100 bytes. The writes are therefore probably no more than 50 bytes at a time.

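        For illustration, a record of roughly that shape might look like the hypothetical Writable below (this is not Bryan's actual class; the fields are invented). Each call inside write() reaches the underlying output buffer as a separate small write, which is the pattern Chris asked about.

        import java.io.DataInput;
        import java.io.DataOutput;
        import java.io.IOException;
        import org.apache.hadoop.io.Writable;

        // Hypothetical record of roughly the shape described above; each call in
        // write() becomes a separate small write into the map output buffer.
        class SmallRecord implements Writable {
          private long id;
          private int flags;
          private byte[] payload = new byte[38];   // ~50 bytes total on the wire

          public void write(DataOutput out) throws IOException {
            out.writeLong(id);       // small write #1
            out.writeInt(flags);     // small write #2
            out.write(payload);      // small write #3
          }

          public void readFields(DataInput in) throws IOException {
            id = in.readLong();
            flags = in.readInt();
            in.readFully(payload);
          }
        }
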
        Chris Douglas added a comment -

        Such small writes are definitely a worst case for the buffer, but I agree: 10% is steep. If the buffer could query the serialized size of the key and value, then the locking could be relaxed somewhat, but given its current semantics I don't see a clear way to eliminate the lock acquisition for each write without introducing additional copies.

        How invasive is the profiling? Is it possible that the overhead in tracing the call to lock() is responsible for some of the 11s overhead? Is the collection thread ever blocked waiting for the spill to complete?

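        To make the cost concrete, here is a minimal sketch of the per-write locking pattern under discussion. It is a paraphrase, not the actual MapTask code, and the field names (spillLock, spillDone, kvbuffer) are illustrative. The point is that a record serialized as several small writes pays for a lock acquisition on each one.

        import java.io.IOException;
        import java.util.concurrent.locks.Condition;
        import java.util.concurrent.locks.ReentrantLock;

        // Illustrative sketch of the per-write locking pattern; not the real code.
        class LockPerWriteBuffer {
          private final ReentrantLock spillLock = new ReentrantLock();
          private final Condition spillDone = spillLock.newCondition();
          private final byte[] kvbuffer = new byte[1 << 20];
          private int bufindex = 0;
          private boolean spillInProgress = false;

          public void write(byte[] b, int off, int len) throws IOException {
            spillLock.lock();          // acquired on every write, however small
            try {
              while (spillInProgress && kvbuffer.length - bufindex < len) {
                spillDone.await();     // wait for the spill thread to free space
              }
              System.arraycopy(b, off, kvbuffer, bufindex, len);
              bufindex += len;
            } catch (InterruptedException e) {
              throw new IOException("Interrupted waiting for spill");
            } finally {
              spillLock.unlock();
            }
          }
        }
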
        Raghu Angadi added a comment -

        Would it be less of a problem if the class used a simple synchronized block instead of a ReentrantLock? It looks like it can be done with synchronized blocks.

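        For comparison, here is what the suggestion amounts to, assuming the same simplified buffer as in the sketch above: the guarded section is expressed with an intrinsic monitor and wait()/notifyAll() instead of a ReentrantLock and Condition. The behavior is equivalent; the open question is whether the monitor profiles cheaper under low contention.

        import java.io.IOException;

        // Sketch of the same write path guarded by a plain synchronized block;
        // wait()/notifyAll() stand in for the Condition used previously.
        class SynchronizedBuffer {
          private final byte[] kvbuffer = new byte[1 << 20];
          private int bufindex = 0;
          private boolean spillInProgress = false;

          public synchronized void write(byte[] b, int off, int len) throws IOException {
            try {
              while (spillInProgress && kvbuffer.length - bufindex < len) {
                wait();                // replaces Condition.await()
              }
            } catch (InterruptedException e) {
              throw new IOException("Interrupted waiting for spill");
            }
            System.arraycopy(b, off, kvbuffer, bufindex, len);
            bufindex += len;
          }

          synchronized void spillFinished() {
            spillInProgress = false;
            notifyAll();               // replaces Condition.signalAll()
          }
        }
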
        Bryan Duxbury added a comment -

        I'm using YourKit in sampling mode, so the overhead incurred by the profiler should be close to zero.

        I understand why we should try to avoid extra buffers in general, but is this a place where we could introduce a configurable buffer? Maybe if the count is set to 1, we skip the buffer, otherwise push to the spill thread only every N writes?

        I'm not sure if synchronized() will solve this problem or not. Is there an easy way to test this?

        Raghu Angadi added a comment -

        > I'm not sure if synchronized() will solve this problem or not. Is there an easy way to test this?

        You could add an extra 'synchronized(tempObject)' around 'lock() and unlock()' in write() to see if that noticeably changes profile numbers for write(). If that looks promising we could try a real patch that replaces the ReentrantLock.

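        Concretely, the experiment might look like the sketch below (tempObject and the helper method are placeholders, not real MapTask code): keep the ReentrantLock in place, wrap it in a redundant monitor, and compare the profiler's numbers for write() before and after.

        import java.util.concurrent.locks.ReentrantLock;

        // Sketch of the proposed experiment; the extra monitor exists only to
        // measure what an intrinsic lock would add to the write path.
        class DiagnosticBuffer {
          private final Object tempObject = new Object();
          private final ReentrantLock spillLock = new ReentrantLock();

          public void write(byte[] b, int off, int len) {
            synchronized (tempObject) {      // added purely for measurement
              spillLock.lock();
              try {
                copyIntoBuffer(b, off, len); // stands in for the unchanged copy
              } finally {
                spillLock.unlock();
              }
            }
          }

          private void copyIntoBuffer(byte[] b, int off, int len) {
            // elided: the existing collection-buffer copy
          }
        }
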
        Chris Douglas added a comment -

        > Would it be less of a problem if the class used a simple synchronized block instead of a ReentrantLock? It looks like it can be done with synchronized blocks.

        This is how it worked prior to HADOOP-3617. ReentrantLock is supposed to be a more lightweight locking mechanism, but there is very little contention for the lock in MapOutputBuffer, so it's possible that "heavier" mechanisms would actually take less time. I wouldn't expect a dramatic difference in execution time, though. The principal cost is inherent in the design, which requires coordination between the writer and the spill thread. Given the structure of the serialization, MapOutputBuffer asks each record to serialize itself into an internal buffer, which the record effects through a series of writes back into the data structure. Since the record size is unknown a priori, the writer must obtain a lock on the structure for each write: the in-memory structure may need to be spilled to accommodate the writes composing the record, the writer may have to wait for that spill to complete, and so on.

        There are also some subtle points when starting/completing the spill thread that are also more easily expressed in ReentrantLocks.

        > I understand why we should try to avoid extra buffers in general, but is this a place where we could introduce a configurable buffer? Maybe if the count is set to 1, we skip the buffer, otherwise push to the spill thread only every N writes?

        How would we size this buffer, unless the user promises to give us fixed-size records? We could allocate extents for MapOutputBuffer and use queues to push them between the collection and spill threads. That might actually be a clearer way to express some of the logic in MapTask, though the serialization becomes slightly more complex and we'd add a layer of indirection.

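        As a rough illustration of the extent idea (names and sizes are invented here), the collection thread could fill fixed-size extents and hand full ones to the spill thread over a queue, so no lock is taken per write. The sketch glosses over exactly the complexity noted above: records spanning extents, the sort metadata, and the added indirection.

        import java.util.concurrent.ArrayBlockingQueue;
        import java.util.concurrent.BlockingQueue;

        // Hypothetical extent design: lock-free writes within an extent, with a
        // queue hand-off to the spill thread whenever an extent fills.
        class ExtentBuffer {
          private static final int EXTENT_SIZE = 64 * 1024;
          private final BlockingQueue<byte[]> fullExtents =
              new ArrayBlockingQueue<byte[]>(16);
          private byte[] current = new byte[EXTENT_SIZE];
          private int pos = 0;

          // Called only from the collection thread.
          public void write(byte[] b, int off, int len) throws InterruptedException {
            while (len > 0) {
              int n = Math.min(len, EXTENT_SIZE - pos);
              System.arraycopy(b, off, current, pos, n);
              pos += n; off += n; len -= n;
              if (pos == EXTENT_SIZE) {
                fullExtents.put(current);    // blocks only if the spill thread lags
                current = new byte[EXTENT_SIZE];
                pos = 0;
              }
            }
          }
        }
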
        Chris Douglas made changes -
        Assignee Chris Douglas [ chris.douglas ]
        Chris Douglas added a comment -

        I discussed this with Owen, and he had an idea that has all the properties of the extent design without requiring a major rewrite. Since the collection thread can determine when it next needs to validate the state of the buffer, it can freely write up to that boundary.

        Attached is a preliminary patch that not only removes some of the excessive spill synchronization but also drops the synchronization on the MapOutputBuffer.Buffer methods. Since the buffer is only really protected by the lock on collect, synchronizing the buffer methods can only make potential corruption well-ordered.

        Bryan: would you mind testing this with your job? I'm still working through its correctness (it passes unit tests, at least), but I'm curious about its effect in your environment.

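        As described, the approach amounts to taking the lock only at a precomputed boundary rather than on every write. A heavily simplified sketch of that shape (this is not the attached patch's code):

        import java.util.concurrent.locks.ReentrantLock;

        // Rough sketch of the boundary idea: writes proceed unsynchronized up to
        // a precomputed safe limit; the lock is taken only to revalidate state.
        class BoundaryCheckedBuffer {
          private final ReentrantLock spillLock = new ReentrantLock();
          private final byte[] kvbuffer = new byte[1 << 20];
          private int bufindex = 0;
          private int safeLimit = 0;   // next point at which state must be rechecked

          public void write(byte[] b, int off, int len) {
            if (bufindex + len > safeLimit) {
              spillLock.lock();        // slow path: taken only at the boundary
              try {
                // elided: wait for or start a spill, then recompute the limit
                safeLimit = kvbuffer.length;
              } finally {
                spillLock.unlock();
              }
            }
            System.arraycopy(b, off, kvbuffer, bufindex, len);  // fast path, no lock
            bufindex += len;
          }
        }
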
        Chris Douglas made changes -
        Attachment 5664-0.patch [ 12406339 ]
        Chris Douglas made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12406339/5664-0.patch
        against trunk revision 771231.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/286/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/286/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/286/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/286/console

        This message is automatically generated.

        Chris Douglas added a comment -

        Moved assignment out of expression, added a comment. No functional changes.

        Chris Douglas made changes -
        Attachment 5664-1.patch [ 12409230 ]
        Chris Douglas added a comment -

        Re-ran test-patch:

             [exec] -1 overall.
             [exec]
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec]
             [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
             [exec]                         Please justify why no tests are needed for this patch.
             [exec]
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec]
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec]
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec]
             [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
             [exec]
             [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
        

        All the mapred tests pass. The test that failed in the original Hudson run was not related.

        Arun C Murthy added a comment -

        +1

        Chris Douglas added a comment -

        I committed this.

        Chris Douglas made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Hadoop Flags [Reviewed]
        Fix Version/s 0.21.0 [ 12313563 ]
        Resolution Fixed [ 1 ]
        Chris Douglas made changes -
        Issue Type Bug [ 1 ] Improvement [ 4 ]
        Owen O'Malley made changes -
        Component/s mapred [ 12310690 ]
        Tom White made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee: Chris Douglas
          • Reporter: Bryan Duxbury
          • Votes: 0
          • Watchers: 5
