Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
0.8.0
-
None
-
None
Description
Two days ago, I ran a 300X TPCH Q1 on a 7 slaves , 1 master tajo clsuter.
Here is my configuration for each worker jvm
export TAJO_WORKER_OPTS="-Xmx20g -Xms20g -XX:MaxPermSize=512m -verbose:gc -Xloggc:$TAJO_LOG_DIR/worker-gc.log -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedOops "
I observed very frequent young gc
$ jstat -gcutil `pgrep -f TajoWorker` 1000 S0 S1 E O P YGC YGCT FGC FGCT GCT 0.00 2.13 57.40 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.40 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.41 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.42 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.42 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.44 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.53 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.57 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.58 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.58 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.58 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.59 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.59 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.60 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 57.60 3.79 99.65 5185 75.042 0 0.000 75.042 0.00 2.13 66.73 3.79 99.66 5185 75.042 0 0.000 75.042 0.00 66.46 17.89 3.79 99.66 5190 75.082 0 0.000 75.082 8.79 0.00 37.36 3.88 99.66 5202 75.533 0 0.000 75.533 0.00 9.10 21.61 3.88 99.66 5216 75.926 0 0.000 75.926 ... 105.523
After finish the Q1 query, GCT is at 105.523, which means young gc cost 30 seconds. As well known, ParNew YGC is a kind of stop the world gc. The whole query need about 98 secs. So in this case, young gc waste nearly 1 third of the query time.