Hadoop Common
  1. Hadoop Common
  2. HADOOP-3398

ReduceTask::closestPowerOf2 is inefficient

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Trivial Trivial
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.18.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      ReduceTask computes the "closest power of 2" using loops

      private static int getClosestPowerOf2(int value) {
        int power = 0;
        int approx = 1;
        while (approx < value) {
          ++power;
          approx = (approx << 1);
        }
        if ((value - (approx >> 1)) < (approx - value)) {
          --power;
        }
        return power;
      }
      

      This could be improved.

      1. 3398-3.patch
        1 kB
        Chris Douglas
      2. 3398-2.patch
        1 kB
        Chris Douglas
      3. 3398-1.patch
        0.7 kB
        Chris Douglas
      4. 3398-0.patch
        0.7 kB
        Chris Douglas

        Activity

        Hide
        Hudson added a comment -
        Show
        Hudson added a comment - Integrated in Hadoop-trunk #499 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/499/ )
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12382331/3398-3.patch
        against trunk revision 657965.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2500/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2500/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2500/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2500/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12382331/3398-3.patch against trunk revision 657965. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2500/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2500/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2500/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2500/console This message is automatically generated.
        Hide
        Chris Douglas added a comment -

        I just committed this.

        Show
        Chris Douglas added a comment - I just committed this.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        +1 3398-3.patch looks good

        Show
        Tsz Wo Nicholas Sze added a comment - +1 3398-3.patch looks good
        Hide
        Chris Douglas added a comment -

        Alright, alright, you pedants.

        Show
        Chris Douglas added a comment - Alright, alright, you pedants.
        Hide
        Amar Kamat added a comment -

        Mathematically, we should throw an exception (Math.log() returns NaN for -ve numbers and -infinity for 0) when value <= 0 since logarithm is undefined.

        +1.

        Show
        Amar Kamat added a comment - Mathematically, we should throw an exception (Math.log() returns NaN for -ve numbers and -infinity for 0) when value <= 0 since logarithm is undefined. +1.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12382208/3398-2.patch
        against trunk revision 656939.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2493/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2493/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2493/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2493/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12382208/3398-2.patch against trunk revision 656939. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2493/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2493/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2493/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2493/console This message is automatically generated.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        Mathematically, we should throw an exception (Math.log() returns NaN for -ve numbers and -infinity for 0) when value <= 0 since logarithm is undefined.

        Show
        Tsz Wo Nicholas Sze added a comment - Mathematically, we should throw an exception (Math.log() returns NaN for -ve numbers and -infinity for 0) when value <= 0 since logarithm is undefined.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        if (value < 0), why not return -getClosestPowerOf2(-value-1)?

        Show
        Tsz Wo Nicholas Sze added a comment - if (value < 0), why not return -getClosestPowerOf2(-value-1)?
        Hide
        Chris Douglas added a comment -

        Thanks for the review and benchmarks. It's obvious that this isn't an important optimization (hence "trivial"), but it was a fun problem.

        Added some documentation and an explicit check for value leq 0 to make its behavior clearer.

        Show
        Chris Douglas added a comment - Thanks for the review and benchmarks. It's obvious that this isn't an important optimization (hence "trivial"), but it was a fun problem. Added some documentation and an explicit check for value leq 0 to make its behavior clearer.
        Hide
        Amar Kamat added a comment -

        Code looks fine. The values are same for both the implementations. Some comments as to what is done could be useful. I compared the two implementations

        input num-iterations before (milli) after (milli)
        10000 1000000 642 19
        5000 1000000 78 9
        2500 1000000 74 5
        1000 1000000 69 3
        500 1000000 51 1
        300 1000000 49 1

        Assuming that the map runtime can vary from 300sec to 10,000 secs and the maximum number of times getClosestPowerOf2() will be invoked is 1,000,000.

        Show
        Amar Kamat added a comment - Code looks fine. The values are same for both the implementations. Some comments as to what is done could be useful. I compared the two implementations input num-iterations before (milli) after (milli) 10000 1000000 642 19 5000 1000000 78 9 2500 1000000 74 5 1000 1000000 69 3 500 1000000 51 1 300 1000000 49 1 Assuming that the map runtime can vary from 300sec to 10,000 secs and the maximum number of times getClosestPowerOf2() will be invoked is 1,000,000.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12382146/3398-1.patch
        against trunk revision 656852.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2482/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2482/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2482/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2482/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12382146/3398-1.patch against trunk revision 656852. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2482/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2482/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2482/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2482/console This message is automatically generated.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12382141/3398-0.patch
        against trunk revision 656852.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2481/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2481/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2481/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2481/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12382141/3398-0.patch against trunk revision 656852. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2481/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2481/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2481/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2481/console This message is automatically generated.
        Hide
        Chris Douglas added a comment -

        This returns the same results as the original, save for values <= 0, which neither support.

        Show
        Chris Douglas added a comment - This returns the same results as the original, save for values <= 0, which neither support.

          People

          • Assignee:
            Chris Douglas
            Reporter:
            Chris Douglas
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development