Details
-
Improvement
-
Status: Closed
-
Minor
-
Resolution: Fixed
Description
Columns in Trafodion tables are stored in 2 formats:
– regular hbase format where each column is stored as one cell
– aligned format where the whole row is packed and stored in one cell
Aligned format provides a performance boost during inserts and selects because
one cell per row is written to or retrieved from hbase instead of multiple
cells. As the number of columns in a table increases, the performance
advantage of aligned format grows.
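The difference in cell count can be illustrated with a minimal sketch. It is not Trafodion's actual encoding: the function names, the `packed` qualifier, and the fixed 32-bit integer layout are assumptions made only for illustration.

```python
import struct

# Hypothetical row: column name -> 32-bit integer value.
row_key = b"row1"
row = {"a": 1, "b": 2, "c": 3, "d": 4}

def to_hbase_format(row_key, row):
    """One cell per column: (row key, column qualifier) -> value."""
    return {(row_key, name.encode()): struct.pack(">i", value)
            for name, value in row.items()}

def to_aligned_format(row_key, row):
    """Whole row packed into a single cell under one assumed qualifier."""
    packed = struct.pack(f">{len(row)}i", *row.values())
    return {(row_key, b"packed"): packed}

def read_column_aligned(cells, row_key, row, name):
    """Reading any column means fetching and unpacking the whole packed row."""
    packed = cells[(row_key, b"packed")]
    values = struct.unpack(f">{len(row)}i", packed)
    return values[list(row).index(name)]

hbase_cells = to_hbase_format(row_key, row)
aligned_cells = to_aligned_format(row_key, row)

# Number of cells touched per row in each format.
print(len(hbase_cells), len(aligned_cells))
```

In this sketch the hbase format touches one cell per column while the aligned format touches one cell per row, which is why the gap widens as columns are added. It also shows the cost: reading a single column still unpacks the whole row.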
There are some limitations with aligned format:
– selection predicates cannot be pushed down to hbase region server
– all columns need to be retrieved and updated as packed row in a cell
– columns cannot be dropped without reloading the table
Over time, these limitations will be removed by use of user defined filters
and coprocessors to select/project rows at hbase region level.
In perf runs, the pros of aligned format outweighed the cons.
This jira is being filed to change the default from hbase format row
to aligned format row. Code for both aligned and hbase format already
exists and is being used.
A table can always be created in either of these 2 formats by explicitly
specifying the format during create time.
The default can also be changed to off or on by inserting the appropriate
value in the system defaults table.
Turning on aligned format as default will be done in 2 phases:
– in phase 1, the aligned default will be turned on during dev regression runs
until it has stabilized.
– in phase 2, the system default will be changed to aligned. All tables created
without an explicit format specification will be created in aligned format.
Metadata, repository, privilege and histogram tables will always be
created in hbase format. This is needed for backward compatibility.
Any component or application that doesn't want to depend on the system
default must explicitly specify the row format in its create ddl.
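The rules above for choosing a row format at create time can be sketched as a small decision function. This is only an illustration of the described precedence, not Trafodion code: the function name, the `table_kind` values, and the format strings are hypothetical.

```python
from typing import Optional

HBASE = "hbase"
ALIGNED = "aligned"

# Table kinds that the description says are always created in hbase format.
SYSTEM_TABLE_KINDS = {"metadata", "repository", "privilege", "histogram"}

def resolve_row_format(explicit_format: Optional[str],
                       table_kind: str,
                       system_default: str) -> str:
    """Pick the row format for a new table (hypothetical helper)."""
    # System tables stay in hbase format for backward compatibility.
    if table_kind in SYSTEM_TABLE_KINDS:
        return HBASE
    # An explicit format in the create ddl overrides the system default.
    if explicit_format is not None:
        return explicit_format
    # Otherwise fall back to the value in the system defaults table.
    return system_default

# Phase 2 behaviour: the system default is aligned.
print(resolve_row_format(None, "user", ALIGNED))
print(resolve_row_format(HBASE, "user", ALIGNED))
print(resolve_row_format(None, "metadata", ALIGNED))
```

Under these assumptions, a user table with no explicit format follows the system default, an explicit format always wins for user tables, and system tables resolve to hbase format regardless of the default.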