ExprNodeColumnEvaluator.evaluate() is a very heavily used function. For most queries, it is called multiple times per row. In the group-by query in the benchmark, it is even called 10 times per row.
This function call sometimes takes 17%-20% of CPU time. Usually ExprNodeColumnEvaluator.evaluate() itself takes 2%-3%, UnionStructObjectInspector.getStructFieldData() itself takes 2%-3%, and ColumnarStruct.uncheckedGetField() itself takes 3%.
It's hard to come up with a general solution that reduces the cost in a structural way. I tried several small code rewrites in the hope of getting slight improvements:
1. nullSequence is passed in once through the constructor instead of on every call.
2. Restructure ColumnarStruct a little bit.
3. In ExprNodeColumnEvaluator, special-case the single-level column reference, which is the common case most of the time when referring to a column.
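To illustrate items 1 and 3, here is a minimal sketch. It is not the actual Hive code: FieldReader, NullAwareField and ColumnEvaluator are hypothetical stand-ins for the object inspector, the lazy field and ExprNodeColumnEvaluator, and rows are modelled as plain Lists. It only shows the shape of the two rewrites: the null sequence is fixed in the constructor, and a single-level column reference skips the generic path walk.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class ColumnAccessSketch {

    /** Hypothetical stand-in for a struct object inspector. */
    public interface FieldReader {
        Object getField(Object struct, int index);
    }

    /** Assumption for this sketch: every struct is a List read by position. */
    public static final FieldReader LIST_READER =
        (struct, index) -> ((List<?>) struct).get(index);

    /** Item 1: the null sequence is supplied once, at construction time. */
    public static final class NullAwareField {
        private final byte[] nullSequence;

        public NullAwareField(byte[] nullSequence) {
            this.nullSequence = nullSequence.clone();
        }

        /** True if data[start, start + length) equals the stored null sequence. */
        public boolean isNull(byte[] data, int start, int length) {
            if (length != nullSequence.length) {
                return false;
            }
            for (int i = 0; i < length; i++) {
                if (data[start + i] != nullSequence[i]) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Item 3: column evaluator with a fast path for single-level references. */
    public static final class ColumnEvaluator {
        private final FieldReader reader;
        private final int[] path;
        private final boolean singleLevel;
        private final int firstIndex;

        public ColumnEvaluator(FieldReader reader, int... path) {
            if (path.length == 0) {
                throw new IllegalArgumentException("path must not be empty");
            }
            this.reader = reader;
            this.path = path.clone();
            this.singleLevel = path.length == 1;
            this.firstIndex = path[0];
        }

        public Object evaluate(Object row) {
            // Common case: one field lookup, no loop over the path.
            if (singleLevel) {
                return reader.getField(row, firstIndex);
            }
            // General case: walk the nested path, stopping at the first null.
            Object current = row;
            for (int i = 0; i < path.length && current != null; i++) {
                current = reader.getField(current, path[i]);
            }
            return current;
        }
    }

    public static void main(String[] args) {
        List<Object> row = Arrays.asList("key", Arrays.asList("x", "y"));
        System.out.println(new ColumnEvaluator(LIST_READER, 0).evaluate(row));
        System.out.println(new ColumnEvaluator(LIST_READER, 1, 1).evaluate(row));

        // "\N" is used here only as an illustrative null sequence.
        byte[] nullSeq = "\\N".getBytes(StandardCharsets.UTF_8);
        NullAwareField field = new NullAwareField(nullSeq);
        System.out.println(field.isNull(nullSeq, 0, nullSeq.length));
    }
}
```

The trade-off matches the one described in item 3: the extra branch duplicates a little logic, but the common single-level case does one lookup instead of entering the loop.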
When trying to optimize functions that already take only 3%, it's hard to verify the performance improvement, since experiments show slight variation on every run anyway.
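One way to judge whether a small gain is larger than the run-to-run noise is to repeat each run several times and compare the means against the spread. The following is only a sketch of that idea, not the harness used for the numbers in this issue; RunVarianceSketch, timeRuns and the dummy workload are all made up for illustration.

```java
public class RunVarianceSketch {

    /** Arithmetic mean of the samples. */
    public static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Sample standard deviation (divides by n - 1). */
    public static double sampleStdDev(double[] xs) {
        if (xs.length < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        double m = mean(xs);
        double sumSq = 0.0;
        for (double x : xs) {
            sumSq += (x - m) * (x - m);
        }
        return Math.sqrt(sumSq / (xs.length - 1));
    }

    /** Runs the workload the given number of times and returns wall-clock times in milliseconds. */
    public static double[] timeRuns(Runnable workload, int runs) {
        double[] millis = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            workload.run();
            millis[i] = (System.nanoTime() - start) / 1e6;
        }
        return millis;
    }

    public static void main(String[] args) {
        // Dummy workload standing in for one query run.
        Runnable workload = () -> {
            long acc = 0;
            for (int i = 0; i < 5_000_000; i++) {
                acc += i % 7;
            }
            if (acc == 42) {
                System.out.println(acc); // keeps the loop from being optimized away
            }
        };
        double[] times = timeRuns(workload, 10);
        System.out.printf("mean=%.3f ms, stddev=%.3f ms%n", mean(times), sampleStdDev(times));
    }
}
```

If the difference between the before and after means is well below the standard deviation, a 1% change cannot be told apart from noise without many more runs.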
For 1 and 2, I think they make the code more readable anyway. I ran the benchmark many times and consistently saw about a 1% improvement as well.
3 might make the code less readable, but I see about a 5% improvement on some simple group-by queries.