Hadoop Map/Reduce / MAPREDUCE-825

JobClient completion poll interval of 5s causes slow tests in local mode

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The JobClient.NetworkedJob.waitForCompletion() method polls for job completion every 5 seconds. When running a set of short tests in pseudo-distributed mode, this is unnecessarily slow and causes lots of wasted time. When bandwidth is not scarce, setting the poll interval to 100 ms results in a 4x speedup in some tests. This interval should be parametrized to allow users to control the interval for testing purposes.

      1. completion-poll-interval.patch
        3 kB
        Aaron Kimball
      2. MAPREDUCE-825.2.patch
        6 kB
        Aaron Kimball

        Activity

        Aaron Kimball created issue -
        Aaron Kimball added a comment -

        This patch adds the jobclient.completion.poll.interval mapreduce parameter, which defaults to the existing value of 5000 ms. No new tests because this is a very small change; it's not clear what new (or any) functionality needs additional verification.
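
The idea behind the parameter can be sketched as follows. This is a minimal illustration, not the code from the patch: java.util.Properties stands in for Hadoop's Configuration, and the class name, the getPollInterval and waitForCompletion helpers, and the BooleanSupplier used to report job completion are all hypothetical. Only the property name and the 5000 ms default come from the comment above.

```java
import java.util.Properties;
import java.util.function.BooleanSupplier;

public class PollIntervalSketch {
    // Property name from the comment above; default matches the old hard-coded value.
    static final String POLL_INTERVAL_KEY = "jobclient.completion.poll.interval";
    static final long DEFAULT_POLL_INTERVAL_MS = 5000L;

    // Reads the interval from a Properties object (an assumed stand-in for
    // Hadoop's Configuration), falling back to the default when unset.
    static long getPollInterval(Properties conf) {
        String raw = conf.getProperty(POLL_INTERVAL_KEY);
        return raw == null ? DEFAULT_POLL_INTERVAL_MS : Long.parseLong(raw.trim());
    }

    // Checks isComplete repeatedly, sleeping intervalMs between checks.
    // Returns how many checks were made, including the one that succeeded.
    static int waitForCompletion(BooleanSupplier isComplete, long intervalMs)
            throws InterruptedException {
        int polls = 1;
        while (!isComplete.getAsBoolean()) {
            Thread.sleep(intervalMs);
            polls++;
        }
        return polls;
    }

    public static void main(String[] args) throws InterruptedException {
        Properties conf = new Properties();
        conf.setProperty(POLL_INTERVAL_KEY, "50"); // short interval for a test run

        // Fake job that reports completion on the third check.
        int[] checksLeft = {3};
        int polls = waitForCompletion(() -> --checksLeft[0] <= 0, getPollInterval(conf));
        System.out.println("polls=" + polls + " interval=" + getPollInterval(conf) + " ms");
    }
}
```

With the property unset, the helper falls back to 5000 ms, so existing behavior is unchanged unless a test opts in to a shorter interval.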

        Aaron Kimball made changes -
        Field Original Value New Value
        Attachment completion-poll-interval.patch [ 12415515 ]
        Aaron Kimball made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Vinod Kumar Vavilapalli added a comment -

        Now that we are doing this, shouldn't we also do something about the hard-coded sleep interval in monitorAndPrintJob() (JobClient.java +1294)? That might help in some test cases too.

        Regarding tests, you can change any of the current test cases to use a lower value for the introduced configuration and verify manually that the test time has decreased. I can see TestTaskFail using three instances of waitForCompletion() in a single test; maybe you can change that test case and see how it goes.

        Aaron Kimball added a comment -

        Agreed about the other timeout. I'll include that as well.

        I can change the suggested test, but as you say, it'll have to be a manual verification and that'll itself be subjective. Thread.sleep guarantees that a thread will sleep for at least 'n' milliseconds, or receive an InterruptedException. But the OS is always free to reschedule your thread later, and there are a lot of other variable-time components in these tests. So writing JUnit tests that compare relative runtimes might nondeterministically fail.

        I'll report back results.

        Aaron Kimball made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Aaron Kimball added a comment -

        Attaching a new patch that also includes a new "jobclient.progress.monitor.poll.interval" setting; default is 1000 ms. Modified TestTaskFail to set the completion poll interval to 50 ms.

        With the default 5000 ms timeout, test runtime was 3 minutes 15 seconds. Setting the timeout to 50 ms reduced test runtime to 3 minutes 8 seconds. If we expect an average of 2500 ms wasted per job in the default case (half the poll interval), then 2500 * 3 = 7500 ms is expected to be wasted across the three waitForCompletion() calls, so the observed speedup of about 7 seconds seems correct. To be sure, I also set the timeout to 20000 ms; test runtime went up to 3 minutes 52 seconds. So there's definitely a correlation.
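
The arithmetic in this comment can be reproduced with a short sketch. It assumes the same simple model the comment uses: each job finishes at a uniformly random point within a poll interval, so on average half an interval is wasted per waitForCompletion() call. The class and method names are hypothetical; the intervals and runtimes are the ones reported above.

```java
public class PollWasteEstimate {
    // Expected wasted time under the assumed model: half a poll interval per job.
    static long expectedWasteMs(long pollIntervalMs, int jobs) {
        return pollIntervalMs * jobs / 2;
    }

    public static void main(String[] args) {
        int jobs = 3; // TestTaskFail uses three waitForCompletion() calls
        long[] intervalsMs = {50L, 5000L, 20000L};
        long[] observedRuntimeSec = {188L, 195L, 232L}; // 3m08s, 3m15s, 3m52s

        // Compare everything against the default 5000 ms run.
        long baselineWasteMs = expectedWasteMs(5000L, jobs);
        long baselineRuntimeSec = 195L;

        for (int i = 0; i < intervalsMs.length; i++) {
            long wasteMs = expectedWasteMs(intervalsMs[i], jobs);
            System.out.printf(
                "interval=%5d ms  expected waste=%5d ms  expected delta=%+6d ms  observed delta=%+3d s%n",
                intervalsMs[i], wasteMs, wasteMs - baselineWasteMs,
                observedRuntimeSec[i] - baselineRuntimeSec);
        }
    }
}
```

Under this model the 50 ms run should save about 7425 ms against the default, close to the observed 7 seconds. The 20000 ms run should add about 22500 ms, while the observed increase was 37 seconds, so the model explains the direction of the change better than its exact size.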

        Aaron Kimball made changes -
        Attachment MAPREDUCE-825.2.patch [ 12415772 ]
        Aaron Kimball made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12415772/MAPREDUCE-825.2.patch
        against trunk revision 801959.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/454/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/454/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/454/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/454/console

        This message is automatically generated.

        Aaron Kimball added a comment -

        Failures are in streaming only.

        Todd Lipcon added a comment -

        Patch looks good to me. +1

        Tom White added a comment -

        I've just committed this. Thanks Aaron!

        Tom White made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Hadoop Flags [Reviewed]
        Fix Version/s 0.21.0 [ 12314045 ]
        Resolution Fixed [ 1 ]
        Tom White made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Transition                    Time In Source Status   Execution Times   Last Executor   Last Execution Date
        Patch Available -> Open       1d 22h 19m              1                 Aaron Kimball   06/Aug/09 20:14
        Open -> Patch Available       5m 1s                   2                 Aaron Kimball   06/Aug/09 20:17
        Patch Available -> Resolved   19d 14h 11m             1                 Tom White       26/Aug/09 10:28
        Resolved -> Closed            363d 11h 46m            1                 Tom White       24/Aug/10 22:15

          People

          • Assignee:
            Aaron Kimball
            Reporter:
            Aaron Kimball
          • Votes:
            0
            Watchers:
            7
