Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Version: 0.13.1
Description
In a map join, Log4JLogger.trace takes about 4% of the CPU time, because it is called once per probe-side row by CommonJoinOperator.genAllOneUniqueJoinObject.
The fix is to remove the logging code below from CommonJoinOperator.genAllOneUniqueJoinObject:
if (allOne) {
  LOG.info("calling genAllOneUniqueJoinObject");
  genAllOneUniqueJoinObject();
  LOG.info("called genAllOneUniqueJoinObject");
} else {
  LOG.trace("calling genUniqueJoinObject");
  genUniqueJoinObject(0, 0);
  LOG.trace("called genUniqueJoinObject");
}
And
if (!hasEmpty && !mayHasMoreThanOne) {
  LOG.trace("calling genAllOneUniqueJoinObject");
  genAllOneUniqueJoinObject();
  LOG.trace("called genAllOneUniqueJoinObject");
} else if (!hasEmpty && !hasLeftSemiJoin) {
  LOG.trace("calling genUniqueJoinObject");
  genUniqueJoinObject(0, 0);
  LOG.trace("called genUniqueJoinObject");
} else {
  LOG.trace("calling genObject");
  genJoinObject();
  LOG.trace("called genObject");
}
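The fix reported here deletes the logging calls outright. Where trace output has to stay available, a common alternative is to read the trace flag once and guard the per-row calls with it. The sketch below is only an illustration of that idea, not the change made for this issue: CountingLog is a hypothetical stand-in for a commons-logging style logger that counts how often the level is looked up, and TraceGuardSketch, perRow and hoisted are made-up names.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; none of these names exist in Hive or commons-logging.
final class TraceGuardSketch {

    /** Stand-in logger that counts how many times the trace level is checked. */
    static final class CountingLog {
        final AtomicLong levelChecks = new AtomicLong();
        private final boolean traceEnabled;

        CountingLog(boolean traceEnabled) {
            this.traceEnabled = traceEnabled;
        }

        boolean isTraceEnabled() {
            levelChecks.incrementAndGet();
            return traceEnabled;
        }

        /** Like a real logger, trace() looks up the level on every call. */
        void trace(Object message) {
            if (isTraceEnabled()) {
                System.out.println(message);
            }
        }
    }

    /** Per-row logging as in the report: two level lookups for every row. */
    static long perRow(CountingLog log, int rows) {
        for (int i = 0; i < rows; i++) {
            log.trace("calling genAllOneUniqueJoinObject");
            // ... per-row join work would happen here ...
            log.trace("called genAllOneUniqueJoinObject");
        }
        return log.levelChecks.get();
    }

    /** Hoisted guard: the level is read once, before the row loop. */
    static long hoisted(CountingLog log, int rows) {
        final boolean trace = log.isTraceEnabled();
        for (int i = 0; i < rows; i++) {
            if (trace) {
                log.trace("calling genAllOneUniqueJoinObject");
            }
            // ... per-row join work would happen here ...
            if (trace) {
                log.trace("called genAllOneUniqueJoinObject");
            }
        }
        return log.levelChecks.get();
    }
}
```

With trace disabled, perRow performs two level lookups per row while hoisted performs a single lookup in total. The trade-off is that a level change made while the loop is running is not picked up until the flag is read again.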
This is the call stack:
Stack Trace                                                           Sample Count   Percentage(%)
hadoop.hive.ql.exec.MapJoinOperator.processOp(Object, int)            388            75.486
hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject()            121            23.541
hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject()    92             17.899
commons.logging.impl.Log4JLogger.trace(Object)                        20             3.891
log4j.Category.log(String, Priority, Object, Throwable)               20             3.891
log4j.Category.getEffectiveLevel()                                    10             1.946
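The "4%" figure in the description can be checked against the sample counts. The sketch below is a small arithmetic check under one assumption: the profile's total sample count is not stated in the report, so it is inferred from the first row (388 samples at 75.486%). SampleShare and its methods are hypothetical helper names.

```java
// Hypothetical helper for checking the profiler percentages; not part of Hive.
final class SampleShare {

    /** Share of the whole profile, in percent, for a frame with the given sample count. */
    static double percent(long samples, long totalSamples) {
        return 100.0 * samples / totalSamples;
    }

    /** Infers the total sample count from one row's count and percentage. */
    static long totalFrom(long samples, double percent) {
        return Math.round(samples * 100.0 / percent);
    }

    public static void main(String[] args) {
        // Inferred from the MapJoinOperator.processOp row, not stated in the report.
        long total = totalFrom(388, 75.486);
        System.out.printf("inferred total samples: %d%n", total);
        System.out.printf("Log4JLogger.trace share: %.3f%%%n", percent(20, total));
    }
}
```

Under that assumption the 20 samples attributed to Log4JLogger.trace come to just under 4% of the profile, which agrees with the figure quoted in the description.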