Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend
    • Labels:
      None

      Description

      To speed up Parquet decoding, we need fast bit-unpacking functions, e.g. along the lines of https://github.com/lemire/FrameOfReference

      Our current BitReader unpacks one value at a time and has many unnecessary branches and other calculations.

        Activity

        tarmstrong Tim Armstrong added a comment -

        -----------
        IMPALA-4123: Fast bit unpacking

        Adds utility functions for fast unpacking of batches of bit-packed
        values. These support reading batches of any number of values provided
        that the start of the batch is aligned to a byte boundary. Callers that
        want to read smaller batches that don't align to byte boundaries will
        need to implement their own buffering.

        The unpacking code uses only portable C++ and no SIMD intrinsics, but is
        fairly efficient because unpacking a full batch of 32 values compiles
        down to 32-bit loads, shifts by constants, masks by constants, bitwise
        ORs when a value straddles 32-bit words, and stores. Further speedups
        should be possible using SIMD intrinsics.
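
A minimal sketch of the technique described above, not the code from the actual change. It assumes a little-endian host and values packed LSB-first into a byte stream, which is the layout Parquet's bit-packed runs use. The names `UnpackOne`, `Unroll`, `Unpack32` and `PackReference` are hypothetical and exist only for this illustration. Because the bit width and the value index are template parameters, every shift and mask is a compile-time constant, and the straddle branch is resolved at compile time.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: extract value I from a batch of 32 values packed at
// BIT_WIDTH bits each. Assumes a little-endian host, so a 4-byte memcpy
// yields the 32-bit word whose bit 0 is the first bit of the stream.
template <int BIT_WIDTH, int I>
inline uint32_t UnpackOne(const uint8_t* in) {
  static_assert(BIT_WIDTH >= 1 && BIT_WIDTH <= 32, "unsupported bit width");
  constexpr int kBitOffset = I * BIT_WIDTH;
  constexpr int kWord = kBitOffset / 32;       // 32-bit word holding the low bits
  constexpr int kShift = kBitOffset % 32;      // constant shift within that word
  constexpr int kHiShift = (32 - kShift) % 32; // kept in [0, 31] so it is always valid
  constexpr uint32_t kMask = 0xFFFFFFFFu >> (32 - BIT_WIDTH);

  uint32_t lo;
  std::memcpy(&lo, in + kWord * 4, sizeof(lo));
  uint32_t v = lo >> kShift;
  // The condition is a compile-time constant, so the optimizer keeps or drops
  // this block per instantiation. It only runs when the value straddles two
  // words, and in that case the second word is still inside the batch.
  if (kShift + BIT_WIDTH > 32) {
    uint32_t hi;
    std::memcpy(&hi, in + (kWord + 1) * 4, sizeof(hi));
    v |= hi << kHiShift;
  }
  return v & kMask;
}

// Compile-time unrolling over I = 0..31 (plain recursion, so C++11 is enough).
template <int BIT_WIDTH, int I>
struct Unroll {
  static void Run(const uint8_t* in, uint32_t* out) {
    Unroll<BIT_WIDTH, I - 1>::Run(in, out);
    out[I] = UnpackOne<BIT_WIDTH, I>(in);
  }
};

template <int BIT_WIDTH>
struct Unroll<BIT_WIDTH, 0> {
  static void Run(const uint8_t* in, uint32_t* out) {
    out[0] = UnpackOne<BIT_WIDTH, 0>(in);
  }
};

// Unpacks exactly 32 values. Reads exactly 4 * BIT_WIDTH bytes from `in`,
// which must start on a byte boundary, and writes 32 values to `out`.
template <int BIT_WIDTH>
inline void Unpack32(const uint8_t* in, uint32_t* out) {
  Unroll<BIT_WIDTH, 31>::Run(in, out);
}

// Slow bit-at-a-time packer, included only to build inputs for checking the
// unpacker. `out` must be zero-initialized and hold n * bit_width bits.
inline void PackReference(const uint32_t* vals, int n, int bit_width,
                          uint8_t* out) {
  for (int i = 0; i < n; ++i) {
    for (int b = 0; b < bit_width; ++b) {
      const int bit = i * bit_width + b;
      if ((vals[i] >> b) & 1u) {
        out[bit / 8] |= static_cast<uint8_t>(1u << (bit % 8));
      }
    }
  }
}
```

As a usage example, the three bytes `0x88 0xC6 0xFA` encode the values 0 through 7 at a bit width of 3. Calling `Unpack32<3>` on a 12-byte buffer that starts with those bytes and is otherwise zero yields 0..7 followed by 24 zeros. A batch shorter than 32 values would need a separate tail path that does not read past the end of the input; that path is not shown here.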

        Testing:
        Added unit tests for unpacking, exhaustively covering different
        bitwidths with additional test dimensions (memory alignment, various
        input sizes, etc).

        Tested under ASAN to ensure the bit unpacking doesn't read past the end
        of buffers.

        Perf:
        Added microbenchmark that shows on average an 8-9x speedup over the
        existing BitReader code.

        Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622
        Reviewed-on: http://gerrit.cloudera.org:8080/4494
        Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
        Tested-by: Internal Jenkins

        jbapple Jim Apple added a comment -

        Might it be premature to mark this resolved before the new faster code is actually used when running queries?

        tarmstrong Tim Armstrong added a comment -

        This task was just to add the utility functions; the parent task is tracking the rest of the work.


          People

          • Assignee:
            tarmstrong Tim Armstrong
            Reporter:
            tarmstrong Tim Armstrong
          • Votes:
            0
            Watchers:
            2
