Uploaded image for project: 'Accumulo'
  1. Accumulo
  2. ACCUMULO-2762

Use better compiler optimizations by default for native maps

Details

    • Improvement
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • 1.5.1, 1.6.0
    • 1.5.2, 1.6.1, 1.7.0
    • native

    Description

      A comment on ACCUMULO-2749 identified improved performance with better compiler optimizations for native maps (-O3 instead of -O2).

      The native shared library is small, and enabling these additional optimizations will not result in a significantly slower build, so I don't see any reason why we should not enable better optimizations by default in the Makefile, if we can identify them.

      Attachments

        1. ACCUMULO-2762.v1.patch
          1.0 kB
          Christopher Tubbs

        Issue Links

          Activity

            afuchs Adam Fuchs added a comment -

            If -O3 is better than -O2, maybe we should really be using -O4 or -O5!

            afuchs Adam Fuchs added a comment - If -O3 is better than -O2, maybe we should really be using -O4 or -O5!

            I don't know which optimizations are available on all other platforms/compilers/environments, but on gcc 4.4.7 on CentOS, the difference between -O2 and -O3 are that -O3 includes -O2 optimizations, plus -fgcse-after-reload -finline-functions -fipa-cp-clone -fpredictive-commoning -ftree-vectorize -funswitch-loops

            On gcc 4.8.2 on Fedora 19 and 20, -O3 also includes -ftree-loop-distribute-patterns -ftree-partial-pre -fvect-cost-model

            In both cases, -O4 and higher are just aliases for -O3. I assume it's the same for other versions of gcc in other environments as well, so I can't tell if the suggestion to use -O4 or -O5 was a serious one. As far as I know, in general, the tradeoffs for enabling these optimizations are compile time and final library size. Since our native code is small, neither of those are very concerning, and we should enable as many optimizations as are available, unless there's a specific reason to exclude some.

            (gcc 4.8.2 offers a lot more optimizations with -O2 than does 4.4.7 with -O2, so the larger difference is probably the compiler version used to build the native libraries, than the difference betwen -O2 and -O3 in a given version.)

            ctubbsii Christopher Tubbs added a comment - I don't know which optimizations are available on all other platforms/compilers/environments, but on gcc 4.4.7 on CentOS, the difference between -O2 and -O3 are that -O3 includes -O2 optimizations, plus -fgcse-after-reload -finline-functions -fipa-cp-clone -fpredictive-commoning -ftree-vectorize -funswitch-loops On gcc 4.8.2 on Fedora 19 and 20, -O3 also includes -ftree-loop-distribute-patterns -ftree-partial-pre -fvect-cost-model In both cases, -O4 and higher are just aliases for -O3. I assume it's the same for other versions of gcc in other environments as well, so I can't tell if the suggestion to use -O4 or -O5 was a serious one. As far as I know, in general, the tradeoffs for enabling these optimizations are compile time and final library size. Since our native code is small, neither of those are very concerning, and we should enable as many optimizations as are available, unless there's a specific reason to exclude some. (gcc 4.8.2 offers a lot more optimizations with -O2 than does 4.4.7 with -O2, so the larger difference is probably the compiler version used to build the native libraries, than the difference betwen -O2 and -O3 in a given version.)

            Patch ACCUMULO-2762.v1.patch uses max optimizations by default.

            I'm uncertain why we were disabling the omit-frame-pointer and strict-aliasing optimizations. Maybe for debugging purposes? If that's the case, they should be omitted by default.

            If they need to be there, then somebody who knows why they should be, should verify this patch. I won't apply it immediately, to give anybody who may know, some time to discuss.

            ctubbsii Christopher Tubbs added a comment - Patch ACCUMULO-2762.v1.patch uses max optimizations by default. I'm uncertain why we were disabling the omit-frame-pointer and strict-aliasing optimizations. Maybe for debugging purposes? If that's the case, they should be omitted by default. If they need to be there, then somebody who knows why they should be, should verify this patch. I won't apply it immediately, to give anybody who may know, some time to discuss.
            elserj Josh Elser added a comment -

            I'm no expert, but, as I understand the gcc options, this patch LGTM.

            elserj Josh Elser added a comment - I'm no expert, but, as I understand the gcc options, this patch LGTM.

            Commit c16f105824ff5e4d56105b9b3cc20d866b0e3b8d in accumulo's branch refs/heads/1.5.2-SNAPSHOT from ctubbsii
            [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=c16f105 ]

            ACCUMULO-2762 Use more compiler optimizations for 1.5 native lib

            jira-bot ASF subversion and git services added a comment - Commit c16f105824ff5e4d56105b9b3cc20d866b0e3b8d in accumulo's branch refs/heads/1.5.2-SNAPSHOT from ctubbsii [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=c16f105 ] ACCUMULO-2762 Use more compiler optimizations for 1.5 native lib

            Commit c16f105824ff5e4d56105b9b3cc20d866b0e3b8d in accumulo's branch refs/heads/1.6.1-SNAPSHOT from ctubbsii
            [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=c16f105 ]

            ACCUMULO-2762 Use more compiler optimizations for 1.5 native lib

            jira-bot ASF subversion and git services added a comment - Commit c16f105824ff5e4d56105b9b3cc20d866b0e3b8d in accumulo's branch refs/heads/1.6.1-SNAPSHOT from ctubbsii [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=c16f105 ] ACCUMULO-2762 Use more compiler optimizations for 1.5 native lib

            Commit 10ce7af89ddaaef1863eed789a6df67946e09c59 in accumulo's branch refs/heads/1.6.1-SNAPSHOT from ctubbsii
            [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=10ce7af ]

            ACCUMULO-2762 Use all optimizations by default

            jira-bot ASF subversion and git services added a comment - Commit 10ce7af89ddaaef1863eed789a6df67946e09c59 in accumulo's branch refs/heads/1.6.1-SNAPSHOT from ctubbsii [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=10ce7af ] ACCUMULO-2762 Use all optimizations by default

            Commit c16f105824ff5e4d56105b9b3cc20d866b0e3b8d in accumulo's branch refs/heads/master from ctubbsii
            [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=c16f105 ]

            ACCUMULO-2762 Use more compiler optimizations for 1.5 native lib

            jira-bot ASF subversion and git services added a comment - Commit c16f105824ff5e4d56105b9b3cc20d866b0e3b8d in accumulo's branch refs/heads/master from ctubbsii [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=c16f105 ] ACCUMULO-2762 Use more compiler optimizations for 1.5 native lib

            Commit 10ce7af89ddaaef1863eed789a6df67946e09c59 in accumulo's branch refs/heads/master from ctubbsii
            [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=10ce7af ]

            ACCUMULO-2762 Use all optimizations by default

            jira-bot ASF subversion and git services added a comment - Commit 10ce7af89ddaaef1863eed789a6df67946e09c59 in accumulo's branch refs/heads/master from ctubbsii [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=10ce7af ] ACCUMULO-2762 Use all optimizations by default

            It was advised to me by a Fedora developer that -fno-strict-aliasing should be left in there, because it could cause problems. I'm reopening this ticket for now, until I (or someone else) can investigate this.

            ctubbsii Christopher Tubbs added a comment - It was advised to me by a Fedora developer that -fno-strict-aliasing should be left in there, because it could cause problems. I'm reopening this ticket for now, until I (or someone else) can investigate this.
            elserj Josh Elser added a comment -

            it could cause problems

            That's adequately, vaguely obtuse... Context?

            elserj Josh Elser added a comment - it could cause problems That's adequately, vaguely obtuse... Context?

            That's all I was told. The details were not conveyed to me, so I don't know what kind of problems it could cause. Hence, the comment about investigating. I'm not actually sure what strict aliasing is, or why we'd disable that optimization, the risks/benefits of it. My guess is that an internet search would reveal some discussion about the pros and cons about using strict aliasing.

            ctubbsii Christopher Tubbs added a comment - That's all I was told. The details were not conveyed to me, so I don't know what kind of problems it could cause. Hence, the comment about investigating. I'm not actually sure what strict aliasing is, or why we'd disable that optimization, the risks/benefits of it. My guess is that an internet search would reveal some discussion about the pros and cons about using strict aliasing.
            elserj Josh Elser added a comment -

            Ah, I figured someone giving such a recommendation would at least provide some context too. Never mind, then.

            elserj Josh Elser added a comment - Ah, I figured someone giving such a recommendation would at least provide some context too. Never mind, then.

            Additional details:

            (esp. -fno-strict-aliasing, removing it could be dangerous)
            in general it's good not to have it as gcc can generate better coda
            but this can break buggy software

            But I have no idea what this means, or how it can break.

            ctubbsii Christopher Tubbs added a comment - Additional details: (esp. -fno-strict-aliasing, removing it could be dangerous) in general it's good not to have it as gcc can generate better coda but this can break buggy software But I have no idea what this means, or how it can break.
            ecn Eric C. Newton added a comment -

            Google turns up this longer explanation .

            The native map should not be considered legacy code, and any bugs we have related to strict aliasing should be fixed. We should remove the flag, though I doubt it has any performance impact.

            I don't know who first added the flag, or why. It has been there prior to incubation.

            ecn Eric C. Newton added a comment - Google turns up this longer explanation . The native map should not be considered legacy code, and any bugs we have related to strict aliasing should be fixed. We should remove the flag, though I doubt it has any performance impact. I don't know who first added the flag, or why. It has been there prior to incubation.
            ecn Eric C. Newton added a comment -

            An even better explanation .

            ecn Eric C. Newton added a comment - An even better explanation .

            Okay. Re-closing this. It seems like it should be fine for us to not exclude this optimization, based on the following assumptions:

            1. Our native code is small, depending mostly on STL stuffs, and we don't do crazy stuff that risks these sorts of bugs.
            2. Risk of buggy code is small, because the code paths are thoroughly exercised and we'd catch them in testing.
            3. If this optimization does cause problems due to buggy code, we'd rather fix the bug rather than avoid the optimization.
            ctubbsii Christopher Tubbs added a comment - Okay. Re-closing this. It seems like it should be fine for us to not exclude this optimization, based on the following assumptions: Our native code is small, depending mostly on STL stuffs, and we don't do crazy stuff that risks these sorts of bugs. Risk of buggy code is small, because the code paths are thoroughly exercised and we'd catch them in testing. If this optimization does cause problems due to buggy code, we'd rather fix the bug rather than avoid the optimization.

            People

              ctubbsii Christopher Tubbs
              ctubbsii Christopher Tubbs
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: