Description
SparkCollector is used to collect the rows generated by HiveMapFunction or HiveReduceFunction. It is currently backed by an ArrayList, and thus has unbounded memory usage. Ideally, the collector should have bounded memory usage and be able to spill to disk when its quota is reached.
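To make the requested behavior concrete, here is a minimal sketch of a row buffer that keeps at most a fixed number of rows in memory and spills the rest to a temporary file. It is only an illustration, not the actual SparkCollector code: the class name SpillingRowBuffer, its methods, the row-count quota, and the use of byte[] as the row type are all assumptions made for this example.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: not part of Hive. Rows are modeled as raw byte arrays.
final class SpillingRowBuffer implements AutoCloseable {
    private final int maxRowsInMemory;              // assumed quota, counted in rows
    private final List<byte[]> memory = new ArrayList<>();
    private File spillFile;                         // created lazily on first spill
    private DataOutputStream spillOut;
    private long spilledRows;

    SpillingRowBuffer(int maxRowsInMemory) {
        if (maxRowsInMemory <= 0) {
            throw new IllegalArgumentException("maxRowsInMemory must be positive");
        }
        this.maxRowsInMemory = maxRowsInMemory;
    }

    // Adds a row; if the in-memory quota is already full, spills first.
    void collect(byte[] row) throws IOException {
        if (memory.size() >= maxRowsInMemory) {
            spill();
        }
        memory.add(row);
    }

    // Appends all in-memory rows to the spill file as length-prefixed records.
    private void spill() throws IOException {
        if (spillOut == null) {
            spillFile = File.createTempFile("collector-spill", ".bin");
            spillFile.deleteOnExit();
            spillOut = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(spillFile)));
        }
        for (byte[] row : memory) {
            spillOut.writeInt(row.length);
            spillOut.write(row);
        }
        spilledRows += memory.size();
        memory.clear();
    }

    int rowsInMemory() { return memory.size(); }

    long rowsOnDisk() { return spilledRows; }

    // Streams rows back in insertion order: spilled rows first, then in-memory rows.
    void forEachRow(Consumer<byte[]> consumer) throws IOException {
        if (spillOut != null) {
            spillOut.flush();
            try (DataInputStream in = new DataInputStream(
                    new BufferedInputStream(new FileInputStream(spillFile)))) {
                for (long i = 0; i < spilledRows; i++) {
                    byte[] row = new byte[in.readInt()];
                    in.readFully(row);
                    consumer.accept(row);
                }
            }
        }
        for (byte[] row : memory) {
            consumer.accept(row);
        }
    }

    // Closes the spill stream and removes the temporary file, if any.
    @Override
    public void close() throws IOException {
        if (spillOut != null) {
            spillOut.close();
            spillFile.delete();
        }
    }
}
```

With a quota of 2, collecting five rows leaves one row in memory and four on disk, and forEachRow still returns them in the order they were collected. A production version would more likely bound memory by bytes rather than by row count, and would serialize the collector's real key/value types.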
Attachments
Issue Links
- incorporates: HIVE-7652 Check OutputCollector after closing ExecMapper/ExecReducer (Resolved)