Tajo (Retired) / TAJO-1950

Query master uses too much memory during range shuffle

Details

    Description

      I ran a simple sort query on an 8TB table as follows.

      tpch10tb> select * from lineitem order by l_orderkey;
      

      After the first stage completes, the query master divides the range of the sort key (l_orderkey) into multiple partitions for the range shuffle. This partitioning took about 9 minutes, during which the JVM pause monitor reported several long GC pauses.
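      To make the partitioning step concrete, the following is a minimal sketch of splitting an integer key range into near-equal sub ranges. It is not Tajo's Repartitioner code: the class RangeSplitSketch, the SubRange type, and the subRange method are hypothetical names, and the sketch assumes the keys are integers from 1 through 6000000000 split uniformly into 10370 sub ranges, the figures that appear in the Repartitioner log line of this report.

```java
class RangeSplitSketch {
    /** Half-open key range [start, end). Hypothetical helper type. */
    static final class SubRange {
        final long start;
        final long end;

        SubRange(long start, long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + ")";
        }
    }

    /**
     * Computes the i-th of n near-equal sub ranges of [min, maxExclusive)
     * directly from its index, so no list of all boundaries is ever built.
     */
    static SubRange subRange(long min, long maxExclusive, int n, int i) {
        if (n <= 0 || i < 0 || i >= n) {
            throw new IllegalArgumentException("need 0 <= i < n, got i=" + i + ", n=" + n);
        }
        long total = maxExclusive - min;
        long base = total / n;   // keys that every sub range receives
        long extra = total % n;  // the first `extra` sub ranges receive one more key
        long start = min + i * base + Math.min(i, extra);
        long length = base + (i < extra ? 1 : 0);
        return new SubRange(start, start + length);
    }

    public static void main(String[] args) {
        // Assumed bounds: keys 1..6000000000 inclusive, 10370 sub ranges.
        long min = 1L;
        long maxExclusive = 6_000_000_001L;
        int n = 10370;
        System.out.println("first: " + subRange(min, maxExclusive, n, 0));
        System.out.println("last:  " + subRange(min, maxExclusive, n, n - 1));
    }
}
```

      Each boundary here is derived from the sub range index alone, so memory use does not grow with the number of sub ranges. Whether the same approach fits Tajo's handling of multi-column or non-numeric sort keys is not something this sketch establishes.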

      Here is the log.

      ...
      2015-10-26 14:23:10,782 INFO org.apache.tajo.engine.planner.global.ParallelExecutionQueue: Next executable block eb_1445835438802_0004_000002
      2015-10-26 14:23:10,782 INFO org.apache.tajo.querymaster.Query: Scheduling Stage:eb_1445835438802_0004_000002
      2015-10-26 14:23:10,796 INFO org.apache.tajo.querymaster.Stage: org.apache.tajo.querymaster.DefaultTaskScheduler is chosen for the task scheduling for eb_1445835438802_0004_000002
      2015-10-26 14:23:10,796 INFO org.apache.tajo.querymaster.Stage: eb_1445835438802_0004_000002, Table's volume is approximately 663647 MB
      2015-10-26 14:23:10,796 INFO org.apache.tajo.querymaster.Stage: eb_1445835438802_0004_000002, The determined number of non-leaf tasks is 10370
      2015-10-26 14:23:10,816 INFO org.apache.tajo.querymaster.Repartitioner: eb_1445835438802_0004_000002, Try to divide [(6000000000), (1)) into 10370 sub ranges (total units: 10370)
      2015-10-26 14:24:58,996 INFO org.apache.tajo.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 2440ms
      GC pool 'PS MarkSweep' had collection(s): count=1 time=2214ms
      GC pool 'PS Scavenge' had collection(s): count=1 time=622ms
      2015-10-26 14:27:24,040 WARN org.apache.tajo.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 13237ms
      GC pool 'PS MarkSweep' had collection(s): count=1 time=12635ms
      GC pool 'PS Scavenge' had collection(s): count=1 time=674ms
      2015-10-26 14:28:51,914 WARN org.apache.tajo.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 20873ms
      GC pool 'PS MarkSweep' had collection(s): count=1 time=20486ms
      GC pool 'PS Scavenge' had collection(s): count=1 time=644ms
      2015-10-26 14:30:52,392 WARN org.apache.tajo.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 30986ms
      GC pool 'PS MarkSweep' had collection(s): count=1 time=30546ms
      GC pool 'PS Scavenge' had collection(s): count=1 time=696ms
      2015-10-26 14:32:07,550 WARN org.apache.tajo.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 15449ms
      GC pool 'PS MarkSweep' had collection(s): count=1 time=14593ms
      GC pool 'PS Scavenge' had collection(s): count=1 time=1148ms
      2015-10-26 14:32:15,807 INFO org.apache.tajo.querymaster.Stage: 10370 objects are scheduled
      ...
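      The timings in the log above can be checked with a short calculation. The sketch below, using only java.time from the JDK, measures the time between the Repartitioner line and the "objects are scheduled" line and sums the five pause durations reported by JvmPauseMonitor. The class name PauseSummary and its methods are hypothetical, and the timestamps and pause values are copied by hand from the log rather than parsed from a file.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class PauseSummary {
    // Timestamp layout used in the log lines, e.g. 2015-10-26 14:23:10,816
    static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    /** Milliseconds between two log timestamps. */
    static long elapsedMillis(String from, String to) {
        LocalDateTime start = LocalDateTime.parse(from, FMT);
        LocalDateTime end = LocalDateTime.parse(to, FMT);
        return Duration.between(start, end).toMillis();
    }

    /** Sum of the given pause durations in milliseconds. */
    static long sum(long... pausesMillis) {
        long total = 0L;
        for (long p : pausesMillis) {
            total += p;
        }
        return total;
    }

    public static void main(String[] args) {
        // Values copied from the log excerpt in this report.
        long elapsed = elapsedMillis("2015-10-26 14:23:10,816",
                                     "2015-10-26 14:32:15,807");
        long paused = sum(2440, 13237, 20873, 30986, 15449);
        System.out.println("partitioning elapsed (ms): " + elapsed);
        System.out.println("reported JVM pauses (ms):  " + paused);
    }
}
```

      With the values above, the elapsed time is 544991 ms (about 9 minutes) and the reported pauses add up to 82985 ms. Only the pauses that the monitor logged in this excerpt are counted, so the sum is a lower bound on time lost to GC.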
      

      Attachments

        1. TAJO-1950proposal.pdf
          141 kB
          Jihoon Son

            People

              jihoonson Jihoon Son
              Votes: 0
              Watchers: 3
