Details
- Improvement
- Status: Closed
- Major
- Resolution: Fixed
- 0.10.0
- None
- Reviewed
Description
There's a call to HiveConf.getBoolVar in FileSinkOperator's processOp method. In benchmarks, we found this call to be using ~2% of the CPU time on simple queries such as INSERT OVERWRITE TABLE t1 SELECT * FROM t2;
This boolean value, a flag that controls whether the RawDataSize stat is collected, does not change while a query is being processed, so we can read it once at initialization and store the result, saving that CPU time.
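A minimal sketch of the idea, not the actual patch: the flag is looked up once during initialization and kept in a field, so the per-row path only reads a boolean. SketchConf, SketchSinkOperator, and the key name are hypothetical stand-ins for illustration; they are not Hive's HiveConf or FileSinkOperator APIs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a configuration object (not Hive's HiveConf).
class SketchConf {
    private final Map<String, String> values = new HashMap<>();
    int lookups = 0; // counts lookups so the saving is observable

    void set(String key, boolean value) {
        values.put(key, Boolean.toString(value));
    }

    boolean getBoolean(String key, boolean defaultValue) {
        lookups++;
        String raw = values.get(key);
        return raw == null ? defaultValue : Boolean.parseBoolean(raw);
    }
}

// Hypothetical stand-in for an operator with an init phase and a per-row phase.
class SketchSinkOperator {
    // Assumed key name, for illustration only.
    static final String COLLECT_RAW_DATA_SIZE = "sketch.stats.collect.rawdatasize";

    private boolean collectRawDataSize; // cached once in initialize()
    private long rawDataSize;

    void initialize(SketchConf conf) {
        // Single configuration lookup for the lifetime of the operator.
        collectRawDataSize = conf.getBoolean(COLLECT_RAW_DATA_SIZE, false);
    }

    void processRow(byte[] row) {
        // Hot path: reads the cached field, no configuration lookup.
        if (collectRawDataSize) {
            rawDataSize += row.length;
        }
    }

    long getRawDataSize() {
        return rawDataSize;
    }
}
```

With this shape, the number of configuration lookups stays at one regardless of how many rows are processed.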
Attachments
Issue Links
- is depended upon by
  - HIVE-3710 HiveConf.ConfVars.HIVE_STATS_COLLECT_RAWDATASIZE should not be checked in FileSinkOperator (Closed)