Author: Alex Behm <firstname.lastname@example.org>
Date: Tue Dec 22 13:56:32 2015 -0800
IMPALA-2789: More compact mem layout with null bits at the end.
There are two motivations for this change:
1. Reduce memory consumption.
2. Pave the way for full memory layout compatibility between
Impala and Kudu to eventually enable zero-copy scans. This
patch is only a first step towards that goal.
New Memory Layout
Slots are placed in descending order by size with trailing bytes to
store null flags. Null flags are omitted for non-nullable slots. There
is no padding between tuples when stored back-to-back in a row batch.
Example: select bool_col, int_col, string_col, smallint_col
Slots:   string_col | int_col | smallint_col | bool_col | null byte
Offsets: 0            16        20             22         23
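
The offsets in the example follow from the placement rule. Below is a
minimal Python sketch of that rule, not Impala code. It assumes slot
sizes of 16, 4, 2 and 1 bytes for string, int, smallint and bool
(inferred from the offsets, not stated in this patch) and one null bit
per nullable slot, packed into whole trailing bytes. compute_layout is
a hypothetical helper name.

```python
def compute_layout(slot_sizes, nullable):
    """Return (offsets, null_offset, tuple_size) for one packed tuple.

    slot_sizes: dict of slot name -> size in bytes (assumed sizes).
    nullable:   set of slot names that need a null flag.
    """
    # Place slots in descending order by size. sorted() is stable, so
    # equally sized slots keep their select-list order (an assumption).
    ordered = sorted(slot_sizes.items(), key=lambda kv: -kv[1])

    offsets = {}
    pos = 0
    for name, size in ordered:
        offsets[name] = pos
        pos += size  # fully packed: no padding between slots

    # Null flags trail the slots; non-nullable slots get no flag.
    null_offset = pos
    num_nullable = sum(1 for name in slot_sizes if name in nullable)
    num_null_bytes = (num_nullable + 7) // 8  # assumed: one bit per slot
    return offsets, null_offset, pos + num_null_bytes


sizes = {"bool_col": 1, "int_col": 4, "string_col": 16, "smallint_col": 2}
offsets, null_offset, tuple_size = compute_layout(sizes, set(sizes))
print(offsets, null_offset, tuple_size)
```

With all four slots nullable, the sketch places the null byte at
offset 23. With no nullable slots the trailing byte disappears.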
The main change is to move the null indicators to the end of tuples.
The new memory layout is fully packed with no padding in between
slots or tuples.
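
Because tuples are packed back-to-back, tuple i in a row batch starts
at i * tuple_size, and a slot's null flag is read from the trailing
byte. The following Python sketch illustrates this and is not Impala
code. It reuses the example's offsets (int_col at 16, null byte at 23)
and assumes a 24-byte tuple, little-endian integers, and bit 0x02 as
the null flag for int_col. The bit assignment and all names are
hypothetical.

```python
import struct

TUPLE_SIZE = 24            # assumed: 23 bytes of slots + 1 null byte
INT_COL_OFFSET = 16        # from the example offsets
NULL_BYTE_OFFSET = 23      # null flags trail the slots
INT_COL_NULL_MASK = 0x02   # hypothetical bit assignment for int_col


def make_tuple(int_val):
    """Build one packed tuple holding only int_col (other slots zeroed)."""
    buf = bytearray(TUPLE_SIZE)
    if int_val is None:
        buf[NULL_BYTE_OFFSET] |= INT_COL_NULL_MASK
    else:
        struct.pack_into("<i", buf, INT_COL_OFFSET, int_val)
    return bytes(buf)


def read_int_col(batch, row):
    """Read int_col of the given row from a packed row batch."""
    base = row * TUPLE_SIZE  # no padding between tuples
    if batch[base + NULL_BYTE_OFFSET] & INT_COL_NULL_MASK:
        return None
    return struct.unpack_from("<i", batch, base + INT_COL_OFFSET)[0]


batch = make_tuple(7) + make_tuple(None) + make_tuple(-3)
print([read_int_col(batch, i) for i in range(3)])
```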
Our standard cluster perf tests showed no significant difference in
query response times or consumed cycles, and a slight reduction in
peak memory consumption.
An exhaustive test run passed. Also ran a few select tests, such as
TPC-H/DS, locally with ASAN.
These follow-on changes are planned:
1. Planner needs to mark slots non-nullable if they correspond
to a non-nullable Kudu column.
2. Update Kudu scan node to copy tuples with memcpy.
3. Kudu client needs to support transferring ownership of the
tuple memory (maybe do direct and indirect buffers separately).
4. Update Kudu scan node to use memory transfer instead of copy.
Reviewed-by: Alex Behm <email@example.com>
Tested-by: Internal Jenkins