Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels:
      None

      Description

      Two days ago, I ran a 300X TPCH Q1 on a 7 slaves , 1 master tajo clsuter.
      Here is my configuration for each worker jvm

      export TAJO_WORKER_OPTS="-Xmx20g -Xms20g -XX:MaxPermSize=512m -verbose:gc -Xloggc:$TAJO_LOG_DIR/worker-gc.log -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedOops "
      

      I observed very frequent young gc

      $ jstat -gcutil  `pgrep -f TajoWorker` 1000
       S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT  
       0.00   2.13  57.40   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.40   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.41   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.42   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.42   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.43   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.43   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.43   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.43   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.43   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.44   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.53   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.57   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.58   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.58   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.58   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.59   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.59   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.60   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  57.60   3.79  99.65   5185   75.042     0    0.000   75.042
       0.00   2.13  66.73   3.79  99.66   5185   75.042     0    0.000   75.042
       0.00  66.46  17.89   3.79  99.66   5190   75.082     0    0.000   75.082
       8.79   0.00  37.36   3.88  99.66   5202   75.533     0    0.000   75.533
       0.00   9.10  21.61   3.88  99.66   5216   75.926     0    0.000   75.926
      ...                                                                                               105.523
      

      After finish the Q1 query, GCT is at 105.523, which means young gc cost 30 seconds. As well known, ParNew YGC is a kind of stop the world gc. The whole query need about 98 secs. So in this case, young gc waste nearly 1 third of the query time.

      1. TAJO-548.patch
        20 kB
        Hyunsik Choi
      2. TAJO-548-v2.patch
        14 kB
        Min Zhou

        Activity

        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-master-build #62 (See https://builds.apache.org/job/Tajo-master-build/62/)
        TAJO-548: Investigate frequent young gc. (Min Zhou via hyunsik) (hyunsik: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=9de88cb9940e2f921d2814bac691f2227c5b963a)

        • tajo-core/tajo-core-backend/src/test/resources/results/TestCTASQuery/testCtasWithGroupby.result
        • tajo-common/src/test/java/org/apache/tajo/util/TestBytes.java
        • tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithGroupby.sql
        • tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithUnion.sql
        • tajo-core/tajo-core-backend/src/test/resources/results/TestCTASQuery/testCtasWithOrderby.result
        • tajo-storage/src/main/java/org/apache/tajo/storage/TextSerializerDeserializer.java
        • tajo-core/tajo-core-backend/src/test/resources/results/TestCTASQuery/testCtasWithLimit.result
        • tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/testCtasWithColumnedPartition.sql
        • CHANGES.txt
        • tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithOrderby.sql
        • tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithLimit.sql
        • tajo-common/src/main/java/org/apache/tajo/util/Bytes.java
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in Tajo-master-build #62 (See https://builds.apache.org/job/Tajo-master-build/62/ ) TAJO-548 : Investigate frequent young gc. (Min Zhou via hyunsik) (hyunsik: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=9de88cb9940e2f921d2814bac691f2227c5b963a ) tajo-core/tajo-core-backend/src/test/resources/results/TestCTASQuery/testCtasWithGroupby.result tajo-common/src/test/java/org/apache/tajo/util/TestBytes.java tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithGroupby.sql tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithUnion.sql tajo-core/tajo-core-backend/src/test/resources/results/TestCTASQuery/testCtasWithOrderby.result tajo-storage/src/main/java/org/apache/tajo/storage/TextSerializerDeserializer.java tajo-core/tajo-core-backend/src/test/resources/results/TestCTASQuery/testCtasWithLimit.result tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/testCtasWithColumnedPartition.sql CHANGES.txt tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithOrderby.sql tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithLimit.sql tajo-common/src/main/java/org/apache/tajo/util/Bytes.java
        Hide
        hyunsik Hyunsik Choi added a comment -

        This issue got +1 on RB. I've just committed it to master branch.

        Thank you Min for your contribution.

        Thank you Jihoon for the review.

        Show
        hyunsik Hyunsik Choi added a comment - This issue got +1 on RB. I've just committed it to master branch. Thank you Min for your contribution. Thank you Jihoon for the review.
        Hide
        hyunsik Hyunsik Choi added a comment -

        I'm very sorry for the late review. I uploaded the rebased patch.

        I checked the source code. As Min mentioned, that source code seems to be derived from Bytes.parseInt used in Hive. So, I believe that it does not license problem.

        The solution definitely reduces the creation of many objects. We have wanted this approach. I throw +1 for this patch.

        Aside from this patch, if some source code is derived from JDK (GPL v2 with classpath exception), the source cannot be included in Apache License Software as far as I know. See the this page (http://www.apache.org/legal/resolved.html#category-x).

        Thank you for your contribution!

        Show
        hyunsik Hyunsik Choi added a comment - I'm very sorry for the late review. I uploaded the rebased patch. I checked the source code. As Min mentioned, that source code seems to be derived from Bytes.parseInt used in Hive. So, I believe that it does not license problem. The solution definitely reduces the creation of many objects. We have wanted this approach. I throw +1 for this patch. Aside from this patch, if some source code is derived from JDK (GPL v2 with classpath exception), the source cannot be included in Apache License Software as far as I know. See the this page ( http://www.apache.org/legal/resolved.html#category-x ). Thank you for your contribution!
        Hide
        hyunsik Hyunsik Choi added a comment -

        Created a review request against branch master in reviewboard
        https://reviews.apache.org/r/17852/

        Show
        hyunsik Hyunsik Choi added a comment - Created a review request against branch master in reviewboard https://reviews.apache.org/r/17852/
        Hide
        coderplay Min Zhou added a comment -

        Never mind, take your time, Hyunsik

        Show
        coderplay Min Zhou added a comment - Never mind, take your time, Hyunsik
        Hide
        hyunsik Hyunsik Choi added a comment -

        I'm sorry for the late. I'll check it today.

        Show
        hyunsik Hyunsik Choi added a comment - I'm sorry for the late. I'll check it today.
        Hide
        coderplay Min Zhou added a comment -

        There is no license problem here. JDK is under a GPLv2 license with classpath exception.

        see http://www.gnu.org/software/classpath/license.html
        see the licensing section on http://openjdk.java.net/faq/
        see http://www.oracle.com/technetwork/java/javase/downloads/license-nb7-jdk6u25-bundle-362993.txt

        Actually, the source code of Bytes.parseInt is also used in hive.
        see https://github.com/apache/hive/blob/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyInteger.java

        So, no worries about the license.

        Show
        coderplay Min Zhou added a comment - There is no license problem here. JDK is under a GPLv2 license with classpath exception. see http://www.gnu.org/software/classpath/license.html see the licensing section on http://openjdk.java.net/faq/ see http://www.oracle.com/technetwork/java/javase/downloads/license-nb7-jdk6u25-bundle-362993.txt Actually, the source code of Bytes.parseInt is also used in hive. see https://github.com/apache/hive/blob/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyInteger.java So, no worries about the license.
        Hide
        jhkim Jinho Kim added a comment -

        Currently, the number converter is overhead of the text deserializer
        This is a useless code. : valueOf(new String())
        Hyunsik and me had discussed. but we found jdk license problem.

        Min Zhou
        Is not to include a license problem in patch ?
        Because parseInt is similar to jdk source.

        Show
        jhkim Jinho Kim added a comment - Currently, the number converter is overhead of the text deserializer This is a useless code. : valueOf(new String()) Hyunsik and me had discussed. but we found jdk license problem. Min Zhou Is not to include a license problem in patch ? Because parseInt is similar to jdk source.
        Hide
        jihoonson Jihoon Son added a comment -

        Awesome!

        Show
        jihoonson Jihoon Son added a comment - Awesome!
        Hide
        hyunsik Hyunsik Choi added a comment -

        Great job!

        I'll also test it on a real cluster, and than I'll share the result soon.

        Show
        hyunsik Hyunsik Choi added a comment - Great job! I'll also test it on a real cluster, and than I'll share the result soon.
        Hide
        coderplay Min Zhou added a comment -

        Add a test case for parsing integer from byte array.

        I will deploy a newer version of JDK on one of the worker machine, and use better tool to make a further investigation on young gc issue.

        Show
        coderplay Min Zhou added a comment - Add a test case for parsing integer from byte array. I will deploy a newer version of JDK on one of the worker machine, and use better tool to make a further investigation on young gc issue.
        Hide
        coderplay Min Zhou added a comment -

        With this patch, 300X TPCH Q1 run faster than before, usually about 80 secs. And the young gc time was successfully reduced , from 30 secs drop to 11 secs.

        Show
        coderplay Min Zhou added a comment - With this patch, 300X TPCH Q1 run faster than before, usually about 80 secs. And the young gc time was successfully reduced , from 30 secs drop to 11 secs.
        Hide
        coderplay Min Zhou added a comment -

        For reducing frequent young gc, we have 2 choices.
        One is to tune the young generation heap size and other related options , the other is to figure out which method create so many temporary objects.

        After some investigation, I found 2 suspicions

        One is

          at java.lang.String.<init>(String.java:570)
                at org.apache.tajo.storage.TextSerializerDeserializer.deserialize(TextSerializerDeserializer.java:139)
                at org.apache.tajo.storage.LazyTuple.get(LazyTuple.java:123)
                at org.apache.tajo.engine.eval.FieldEval.eval(FieldEval.java:51)
                at org.apache.tajo.engine.eval.BinaryEval.eval(BinaryEval.java:140)
                at org.apache.tajo.engine.eval.BinaryEval.eval(BinaryEval.java:140)
                at org.apache.tajo.engine.planner.Projector.eval(Projector.java:48)
                at org.apache.tajo.engine.planner.physical.SeqScanExec.next(SeqScanExec.java:186)
                at org.apache.tajo.engine.planner.physical.HashAggregateExec.compute(HashAggregateExec.java:57)
                at org.apache.tajo.engine.planner.physical.HashAggregateExec.next(HashAggregateExec.java:83)
                at org.apache.tajo.engine.planner.physical.HashShuffleFileWriteExec.next(HashShuffleFileWriteExec.java:117)
                at org.apache.tajo.worker.Task.run(Task.java:370)
                at org.apache.tajo.worker.TaskRunner$1.run(TaskRunner.java:392)
                at java.lang.Thread.run(Thread.java:662)
        

        The other is

         at java.util.ArrayList.<init>(ArrayList.java:119)
                at org.apache.tajo.util.Bytes.splitWorker(Bytes.java:1490)
                at org.apache.tajo.util.Bytes.splitPreserveAllTokens(Bytes.java:1451)
                at org.apache.tajo.storage.CSVFile$CSVScanner.next(CSVFile.java:429)
                at org.apache.tajo.engine.planner.physical.SeqScanExec.next(SeqScanExec.java:183)
                at org.apache.tajo.engine.planner.physical.HashAggregateExec.compute(HashAggregateExec.java:57)
                at org.apache.tajo.engine.planner.physical.HashAggregateExec.next(HashAggregateExec.java:83)
                at org.apache.tajo.engine.planner.physical.HashShuffleFileWriteExec.next(HashShuffleFileWriteExec.java:117)
                at org.apache.tajo.worker.Task.run(Task.java:370)
                at org.apache.tajo.worker.TaskRunner$1.run(TaskRunner.java:392)
                at java.lang.Thread.run(Thread.java:662)
        

        The first one is easy to improve. That line parse an byte array, convert it into a double value. The original way is firstly convert the byte array to a String, which involved decoding from byte array to char array and creating char array and string instance. After that, using Double.valueOf(String) to achieve the conversion purpose. Actually, we can directly parse a byte array into double , without intermediary conversion, and without allocating memory for intermediary char array and string objects.

        Show
        coderplay Min Zhou added a comment - For reducing frequent young gc, we have 2 choices. One is to tune the young generation heap size and other related options , the other is to figure out which method create so many temporary objects. After some investigation, I found 2 suspicions One is at java.lang.String.<init>(String.java:570) at org.apache.tajo.storage.TextSerializerDeserializer.deserialize(TextSerializerDeserializer.java:139) at org.apache.tajo.storage.LazyTuple.get(LazyTuple.java:123) at org.apache.tajo.engine.eval.FieldEval.eval(FieldEval.java:51) at org.apache.tajo.engine.eval.BinaryEval.eval(BinaryEval.java:140) at org.apache.tajo.engine.eval.BinaryEval.eval(BinaryEval.java:140) at org.apache.tajo.engine.planner.Projector.eval(Projector.java:48) at org.apache.tajo.engine.planner.physical.SeqScanExec.next(SeqScanExec.java:186) at org.apache.tajo.engine.planner.physical.HashAggregateExec.compute(HashAggregateExec.java:57) at org.apache.tajo.engine.planner.physical.HashAggregateExec.next(HashAggregateExec.java:83) at org.apache.tajo.engine.planner.physical.HashShuffleFileWriteExec.next(HashShuffleFileWriteExec.java:117) at org.apache.tajo.worker.Task.run(Task.java:370) at org.apache.tajo.worker.TaskRunner$1.run(TaskRunner.java:392) at java.lang.Thread.run(Thread.java:662) The other is at java.util.ArrayList.<init>(ArrayList.java:119) at org.apache.tajo.util.Bytes.splitWorker(Bytes.java:1490) at org.apache.tajo.util.Bytes.splitPreserveAllTokens(Bytes.java:1451) at org.apache.tajo.storage.CSVFile$CSVScanner.next(CSVFile.java:429) at org.apache.tajo.engine.planner.physical.SeqScanExec.next(SeqScanExec.java:183) at org.apache.tajo.engine.planner.physical.HashAggregateExec.compute(HashAggregateExec.java:57) at org.apache.tajo.engine.planner.physical.HashAggregateExec.next(HashAggregateExec.java:83) at org.apache.tajo.engine.planner.physical.HashShuffleFileWriteExec.next(HashShuffleFileWriteExec.java:117) at org.apache.tajo.worker.Task.run(Task.java:370) at org.apache.tajo.worker.TaskRunner$1.run(TaskRunner.java:392) at java.lang.Thread.run(Thread.java:662) The first one is easy to improve. That line parse an byte array, convert it into a double value. The original way is firstly convert the byte array to a String, which involved decoding from byte array to char array and creating char array and string instance. After that, using Double.valueOf(String) to achieve the conversion purpose. Actually, we can directly parse a byte array into double , without intermediary conversion, and without allocating memory for intermediary char array and string objects.

          People

          • Assignee:
            coderplay Min Zhou
            Reporter:
            coderplay Min Zhou
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development