This is a preliminary patch that includes all the new classes. They are not integrated with GroupByOperator yet, but the integration work is pretty straightforward.
GenericUDAF is more complex than I thought at first, so I've created several classes for it:
1. GenericUDAFResolver: takes a function name and the list of parameter TypeInfos and returns a GenericUDAFEvaluator.
2. GenericUDAFEvaluator: allows 2 things:
2.1 Create a new aggregation result buffer
2.2 Update an aggregation result buffer, or terminate the aggregation and get the results.
3. The aggregation result buffer in step 2 is an interface. Each GenericUDAFEvaluator should have its own aggregation result buffer class to store the data (for example, a count for count(), a count and a sum for average()).
1 is used at compile time; 2 and 3 are used at runtime.
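To make the three roles concrete, here is a minimal, self-contained sketch. The interface names (Resolver, Evaluator, AggregationBuffer), their signatures, and the use of plain strings for parameter types are simplified stand-ins I made up for illustration; they are not the actual signatures in the patch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-ins for the classes described above.
// Names and signatures are illustrative only, not the patch's API.

/** 3. Marker interface for the aggregation result buffer. */
interface AggregationBuffer {}

/** 2. Runtime: creates buffers, updates them, and produces the result. */
interface Evaluator {
    AggregationBuffer newBuffer();
    void update(AggregationBuffer buffer, Object value);
    Object terminate(AggregationBuffer buffer);
}

/** 1. Compile time: picks an evaluator from the function name and parameter types. */
interface Resolver {
    Evaluator resolve(String functionName, List<String> parameterTypes);
}

/** count(x): the buffer holds a single long and nothing else. */
final class CountEvaluator implements Evaluator {
    static final class CountBuffer implements AggregationBuffer {
        long count;
    }

    @Override
    public AggregationBuffer newBuffer() {
        return new CountBuffer();
    }

    @Override
    public void update(AggregationBuffer buffer, Object value) {
        if (value != null) {                    // SQL count(x) skips NULLs
            ((CountBuffer) buffer).count++;
        }
    }

    @Override
    public Object terminate(AggregationBuffer buffer) {
        return ((CountBuffer) buffer).count;    // boxed to Long
    }
}

/** A toy resolver that only knows count with one parameter. */
final class SimpleResolver implements Resolver {
    @Override
    public Evaluator resolve(String functionName, List<String> parameterTypes) {
        if ("count".equalsIgnoreCase(functionName) && parameterTypes.size() == 1) {
            return new CountEvaluator();
        }
        throw new IllegalArgumentException(
            "No evaluator for " + functionName + parameterTypes);
    }
}

final class ResolverDemo {
    public static void main(String[] args) {
        // Compile time: resolve once.
        Evaluator eval = new SimpleResolver().resolve("count", Arrays.asList("string"));
        // Runtime: create a buffer, feed rows, then terminate.
        AggregationBuffer buf = eval.newBuffer();
        for (Object v : new Object[] {"a", null, "b"}) {
            eval.update(buf, v);
        }
        System.out.println(eval.terminate(buf)); // prints 2
    }
}
```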
The reasons I split 2 and 3 are:
A. It shrinks the aggregation result buffer: only a "long" is needed for count(). (The input's ObjectInspector and the output writable object (e.g. Long or LongWritable for count()) are both stored in the GenericUDAFEvaluator.)
B. It makes it easier to move to HIVE-535: A3 in the future.
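As a rough illustration of reason A, the sketch below keeps one evaluator for the whole query and one small buffer per group key, so shared state is paid for once rather than once per group. Everything here (AverageEvaluator, DoubleHolder, GroupByDemo, and the HashMap keyed by a string) is a hypothetical stand-in, not code from the patch; DoubleHolder plays the role of the reusable output writable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one shared evaluator, one small buffer per group key.
final class AverageEvaluator {
    /** Per-group state for average(): just a count and a sum. */
    static final class AverageBuffer {
        long count;
        double sum;
    }

    /** Stand-in for a reusable output writable. */
    static final class DoubleHolder {
        double value;
    }

    // Shared state lives in the evaluator, once, not in every buffer.
    // In the patch this is where the input ObjectInspector and the
    // output object are kept; only the output object is modelled here.
    private final DoubleHolder output = new DoubleHolder();

    AverageBuffer newBuffer() {
        return new AverageBuffer();
    }

    void update(AverageBuffer buffer, double value) {
        buffer.count++;
        buffer.sum += value;
    }

    /** Returns the shared output object, or null if no rows were seen. */
    DoubleHolder terminate(AverageBuffer buffer) {
        if (buffer.count == 0) {
            return null;
        }
        output.value = buffer.sum / buffer.count;
        return output; // reused across groups: read it before the next call
    }
}

final class GroupByDemo {
    static Map<String, Double> averageByKey(String[] keys, double[] values) {
        AverageEvaluator eval = new AverageEvaluator(); // one evaluator
        Map<String, AverageEvaluator.AverageBuffer> buffers = new HashMap<>();
        for (int i = 0; i < keys.length; i++) {
            // One lightweight buffer per distinct key.
            AverageEvaluator.AverageBuffer b =
                buffers.computeIfAbsent(keys[i], k -> eval.newBuffer());
            eval.update(b, values[i]);
        }
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, AverageEvaluator.AverageBuffer> e : buffers.entrySet()) {
            AverageEvaluator.DoubleHolder out = eval.terminate(e.getValue());
            // Copy the value out immediately because the holder is reused.
            result.put(e.getKey(), out == null ? null : out.value);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> r = averageByKey(
            new String[] {"a", "b", "a"}, new double[] {1.0, 10.0, 3.0});
        System.out.println(r.get("a")); // prints 2.0
        System.out.println(r.get("b")); // prints 10.0
    }
}
```

With many distinct group keys, each key costs only the two fields in AverageBuffer, while the evaluator and its output object exist once.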