Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
I see that this Kudu page [1] states that the "default" encoding for int8, etc. is "bit shuffle", whereas this Impala page [2] describes AUTO_ENCODING as: "use the default encoding based on the column type; currently always the same as PLAIN_ENCODING, but subject to change in the future." These two statements contradict each other.
I ran some Kudu commands to inspect the actual data, and it appears that BIT_SHUFFLE is what is actually used for integer columns when no column encoding is specified.
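To make the expected behavior concrete, here is a minimal Python sketch of how AUTO_ENCODING would resolve if it followed the per-type defaults on the Kudu page [1] rather than always meaning PLAIN_ENCODING. The table and the helper `effective_encoding` are hypothetical illustrations, not part of any Kudu or Impala API. Only the integer rows reflect what was observed in this report; the other rows are my reading of the Kudu documentation and should be checked against [1].

```python
# Default encoding per column type, using Impala's ENCODING attribute names.
# Integer rows match the observation in this report; the remaining rows are
# assumed from the Kudu schema design page [1] and cover only a subset of types.
KUDU_DEFAULT_ENCODING = {
    "int8": "BIT_SHUFFLE",
    "int16": "BIT_SHUFFLE",
    "int32": "BIT_SHUFFLE",
    "int64": "BIT_SHUFFLE",
    "float": "BIT_SHUFFLE",
    "double": "BIT_SHUFFLE",
    "bool": "RLE",
    "string": "DICT_ENCODING",
    "binary": "DICT_ENCODING",
}


def effective_encoding(column_type, requested="AUTO_ENCODING"):
    """Hypothetical helper: return the encoding that would actually apply.

    An explicit encoding is returned unchanged; AUTO_ENCODING is resolved
    to the per-type default from the table above.
    """
    if requested != "AUTO_ENCODING":
        return requested
    return KUDU_DEFAULT_ENCODING[column_type.lower()]


if __name__ == "__main__":
    # An int8 column with no explicit encoding resolves to the Kudu default.
    print(effective_encoding("int8"))
    # An explicit encoding always wins over the default.
    print(effective_encoding("int32", "PLAIN_ENCODING"))
```

Under this reading, the Impala wording "currently always the same as PLAIN_ENCODING" would only be accurate if every entry in the table were PLAIN_ENCODING, which is not what the inspected data suggests for integer columns.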
[1] http://kudu.apache.org/docs/schema_design.html#encoding
[2] https://impala.apache.org/docs/build/html/topics/impala_kudu.html