  1. Apache Arrow
  2. ARROW-8141

[C++] Optimize BM_PlainDecodingBoolean performance using AVX512 Intrinsics API

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.17.0
    • Component/s: C++

    Description

      We ran the benchmarks on the Arrow AVX512 build, and perf shows unpack1_32 as the major hotspot for the BM_PlainDecodingBoolean benchmark.

      Implementing this function with AVX512 intrinsics gives a large improvement. The results below were measured on a CLX 8280 CPU, which supports AVX512.

      Indicator                             Default SSE build   AVX512 build   AVX512 build + intrinsics   Intrinsics improvement (ratio over AVX512 build)
      BM_PlainDecodingBoolean/1024 (G/s)    1.55394             3.77701        5.02805                     1.331224964
      BM_PlainDecodingBoolean/4096 (G/s)    1.83472             5.3826         8.3443                      1.550235945
      BM_PlainDecodingBoolean/32768 (G/s)   2.00957             6.1258         10.3793                     1.694358288
      BM_PlainDecodingBoolean/65536 (G/s)   2.02249             6.20035        10.5778                     1.706000468
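
To make the idea in the description concrete, here is a minimal sketch of how a 1-bit unpack routine might be written with AVX512 intrinsics. It is not the code from the actual patch: the function signature (one packed 32-bit word in, 32 unpacked uint32_t values out, bit 0 first) and the helper names unpack1_32_scalar and unpack1_32_avx512 are assumptions made for illustration. The AVX512 path is compiled only when the compiler defines __AVX512F__; otherwise the scalar reference is used.

```cpp
#include <cstdint>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

// Scalar reference: expand the 32 bits of *in into 32 uint32_t values (0 or 1),
// least significant bit first. Returns the advanced input pointer.
inline const uint32_t* unpack1_32_scalar(const uint32_t* in, uint32_t* out) {
  const uint32_t word = *in;
  for (int i = 0; i < 32; ++i) {
    out[i] = (word >> i) & 1u;
  }
  return in + 1;
}

#if defined(__AVX512F__)
// AVX512 sketch: broadcast the packed word to all 16 lanes, shift each lane
// right by its own bit index, then mask with 1. Two 512-bit stores cover the
// 32 outputs.
inline const uint32_t* unpack1_32_avx512(const uint32_t* in, uint32_t* out) {
  const __m512i word = _mm512_set1_epi32(static_cast<int>(*in));
  const __m512i ones = _mm512_set1_epi32(1);
  const __m512i shifts_lo = _mm512_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7,
                                              8, 9, 10, 11, 12, 13, 14, 15);
  const __m512i shifts_hi = _mm512_setr_epi32(16, 17, 18, 19, 20, 21, 22, 23,
                                              24, 25, 26, 27, 28, 29, 30, 31);
  const __m512i lo = _mm512_and_si512(_mm512_srlv_epi32(word, shifts_lo), ones);
  const __m512i hi = _mm512_and_si512(_mm512_srlv_epi32(word, shifts_hi), ones);
  _mm512_storeu_si512(out, lo);        // bits 0..15
  _mm512_storeu_si512(out + 16, hi);   // bits 16..31
  return in + 1;
}
#endif

// Compile-time dispatch (hypothetical wrapper): use the intrinsics version
// when AVX512F is enabled for this translation unit, else the scalar loop.
inline const uint32_t* unpack1_32(const uint32_t* in, uint32_t* out) {
#if defined(__AVX512F__)
  return unpack1_32_avx512(in, out);
#else
  return unpack1_32_scalar(in, out);
#endif
}
```

The variable-shift instruction (_mm512_srlv_epi32) lets each lane extract a different bit of the same word, so the per-bit loop collapses into two shift-and-mask operations per 32 outputs. Whether this matches the speedups in the table depends on the compiler, flags, and hardware.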

            People

              Assignee: Frank Du (frank.du)
              Reporter: Frank Du (frank.du)

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 50m