Details
-
Improvement
-
Status: Open
-
Major
-
Resolution: Unresolved
-
Impala 3.3.0
-
None
-
None
Description
Several Thrift structures act like unions but use memory for all fields. For example, TExprNode is a struct with many mutually exclusive fields:
...
// The function to execute. Not set for SlotRefs and Literals.
5: optional Types.TFunction fn

// If set, child[vararg_start_idx] is the first vararg child.
6: optional i32 vararg_start_idx

7: optional TBoolLiteral bool_literal
8: optional TCaseExpr case_expr
9: optional TDateLiteral date_literal
10: optional TFloatLiteral float_literal
11: optional TIntLiteral int_literal
12: optional TInPredicate in_predicate
13: optional TIsNullPredicate is_null_pred
14: optional TLiteralPredicate literal_pred
15: optional TSlotRef slot_ref
16: optional TStringLiteral string_literal
17: optional TTupleIsNullPredicate tuple_is_null_pred
18: optional TDecimalLiteral decimal_literal
19: optional TAggregateExpr agg_expr
20: optional TTimestampLiteral timestamp_literal
21: optional TKuduPartitionExpr kudu_partition_expr
This inflates the in-memory (decoded) size of the structure. In C++, sizeof(TExprNode) is 720 bytes, based on this simple test:
TEST(PrintSizeTest, TExpr) {
  impala::TExprNode expr;
  LOG(INFO) << "Sizeof(TExprNode) = " << sizeof(expr);
}
It should be possible to reduce this considerably, and with some close attention it may be possible to shrink it by 10x or more.
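To illustrate where the savings could come from, here is a minimal, standalone sketch that compares a struct holding every alternative with a std::variant holding only one at a time. It does not use Impala or Thrift code: the payload types, StructLikeNode, UnionLikeNode, PrintSizes and MakeIntNode are all hypothetical names, and the struct layout is only an approximation of what a generated class with optional fields looks like. It assumes a C++17 compiler.

```cpp
#include <cstdint>
#include <iostream>
#include <string>
#include <variant>

// Hypothetical stand-ins for a few Thrift-generated payload types.
struct BoolLiteral { bool value = false; };
struct IntLiteral { std::int64_t value = 0; };
struct FloatLiteral { double value = 0.0; };
struct StringLiteral { std::string value; };
struct DecimalLiteral { std::string value; };
struct SlotRef { std::int32_t slot_id = 0; };

// Struct-style layout: every alternative is a member, so the object pays
// for all of them even though at most one is meaningful at a time.
struct StructLikeNode {
  BoolLiteral bool_literal;
  IntLiteral int_literal;
  FloatLiteral float_literal;
  StringLiteral string_literal;
  DecimalLiteral decimal_literal;
  SlotRef slot_ref;
  // One "is set" flag per optional field (assumed layout, for illustration).
  struct IsSet {
    bool bool_literal : 1;
    bool int_literal : 1;
    bool float_literal : 1;
    bool string_literal : 1;
    bool decimal_literal : 1;
    bool slot_ref : 1;
  } isset{};
};

// Union-style layout: storage for the largest alternative plus a small tag.
using UnionLikeNode = std::variant<BoolLiteral, IntLiteral, FloatLiteral,
                                   StringLiteral, DecimalLiteral, SlotRef>;

static_assert(sizeof(UnionLikeNode) < sizeof(StructLikeNode),
              "the union-style layout should be smaller");

// Builds a node that holds only an integer literal.
inline UnionLikeNode MakeIntNode(std::int64_t v) { return IntLiteral{v}; }

// Prints both sizes; the exact numbers depend on the platform and the
// standard library implementation.
inline void PrintSizes() {
  std::cout << "sizeof(StructLikeNode) = " << sizeof(StructLikeNode) << "\n"
            << "sizeof(UnionLikeNode)  = " << sizeof(UnionLikeNode) << "\n";
}
```

Calling PrintSizes() from a test or main shows the gap for these six alternatives. With the 15 or so mutually exclusive fields of the real structure the gap would presumably be wider, though the actual saving would have to be measured against the generated code.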
TExprNode is notable because it can be instantiated many times (e.g. for complicated expressions or for large numbers of partitions). However, the same problem may apply to other structs.
Issue Links
- is related to IMPALA-9477 Only add required partitions to TDescriptorTable for hdfs table sinks (Open)