Tajo / TAJO-1983

Improve memory usage of ExternalSortExec

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0, 0.10.0, 0.11.0
    • Fix Version/s: 0.12.0, 0.11.1
    • Component/s: Offheap, Physical Operator
    • Labels:
      None

      Description

      ExternalSortExec keeps a list of tuples on the heap for sorting, which causes excessive GC.
      We should use off-heap tuples instead of VTuple.
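
To illustrate the idea in the description, here is a minimal sketch of a tuple list that keeps its rows outside the Java heap. It is not Tajo's UnSafeTupleList: the class name `OffHeapIntTupleList`, the fixed (int key, int payload) row layout, and the packed-long sort are all assumptions made for this example. The point it shows is that rows stored in a direct `ByteBuffer` create no per-tuple objects for the garbage collector to trace.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical sketch of an off-heap tuple list; not Tajo's actual UnSafeTupleList. */
final class OffHeapIntTupleList {
  // Assumed fixed-width row: a 4-byte sort key followed by a 4-byte payload.
  private static final int TUPLE_BYTES = 2 * Integer.BYTES;

  private ByteBuffer buf;
  private int size;

  OffHeapIntTupleList(int initialCapacity) {
    // allocateDirect reserves memory outside the garbage-collected heap.
    buf = ByteBuffer.allocateDirect(Math.max(1, initialCapacity) * TUPLE_BYTES);
  }

  void add(int key, int payload) {
    if (buf.remaining() < TUPLE_BYTES) {
      // Grow by doubling and copy the rows written so far.
      ByteBuffer bigger = ByteBuffer.allocateDirect(buf.capacity() * 2);
      buf.flip();
      bigger.put(buf);
      buf = bigger;
    }
    buf.putInt(key).putInt(payload);
    size++;
  }

  int size() { return size; }

  int key(int i) { return buf.getInt(i * TUPLE_BYTES); }

  int payload(int i) { return buf.getInt(i * TUPLE_BYTES + Integer.BYTES); }

  /** Returns row indexes ordered by key; ties keep insertion order. */
  int[] sortedOrder() {
    // Pack (key, index) into one long so a primitive sort can be used
    // and no comparator or boxed object is needed per row.
    long[] packed = new long[size];
    for (int i = 0; i < size; i++) {
      packed[i] = ((long) key(i) << 32) | i;
    }
    Arrays.sort(packed);
    int[] order = new int[size];
    for (int i = 0; i < size; i++) {
      order[i] = (int) packed[i]; // the low 32 bits hold the row index
    }
    return order;
  }

  public static void main(String[] args) {
    OffHeapIntTupleList list = new OffHeapIntTupleList(2);
    list.add(30, 300);
    list.add(-5, 50);
    list.add(10, 100);
    for (int i : list.sortedOrder()) {
      System.out.println(list.key(i) + " -> " + list.payload(i));
    }
  }
}
```

A real implementation would also need variable-length fields, null handling, and explicit release of the native memory, none of which this sketch attempts.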

        Activity

        githubbot ASF GitHub Bot added a comment -

        GitHub user jinossy opened a pull request:

        https://github.com/apache/tajo/pull/869

        TAJO-1983: Improve memory usage of ExternalSortExec.

        I've added UnSafeTupleList, which uses off-heap memory.
        The last chunk is not stored in a local file, and the chunks are merged when the next method is called.
        In addition, I've changed the level of insignificant log messages to debug.

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/jinossy/tajo TAJO-1983

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/tajo/pull/869.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #869
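
The pull request description above says that the last chunk stays in memory and that chunks are merged when next is called. One way to picture that behaviour is a lazy k-way merge over sorted runs, sketched below. This is an illustration under assumptions, not the code from the patch: the class `LazyChunkMerge` and its `Cursor` helper are hypothetical, and plain in-memory lists stand in for the spilled chunk files.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

/** Hypothetical sketch of a lazy k-way merge; not the ExternalSortExec code. */
final class LazyChunkMerge implements Iterator<Integer> {
  /** One sorted run together with its current smallest unread value. */
  private static final class Cursor {
    final Iterator<Integer> it;
    int head;

    Cursor(Iterator<Integer> it) {
      this.it = it;
      this.head = it.next();
    }
  }

  // Min-heap ordered by the head value of each run.
  private final PriorityQueue<Cursor> heap =
      new PriorityQueue<>(Comparator.comparingInt((Cursor c) -> c.head));

  LazyChunkMerge(List<? extends Iterable<Integer>> sortedChunks) {
    for (Iterable<Integer> chunk : sortedChunks) {
      Iterator<Integer> it = chunk.iterator();
      if (it.hasNext()) {
        heap.add(new Cursor(it));
      }
    }
  }

  @Override
  public boolean hasNext() {
    return !heap.isEmpty();
  }

  @Override
  public Integer next() {
    // Merge work happens here, one value per call, instead of up front.
    Cursor c = heap.poll();
    if (c == null) {
      throw new NoSuchElementException();
    }
    int result = c.head;
    if (c.it.hasNext()) {
      c.head = c.it.next();
      heap.add(c);
    }
    return result;
  }

  public static void main(String[] args) {
    // Stand-ins for two chunks that were spilled to local files.
    List<Integer> spilled1 = Arrays.asList(1, 4, 9);
    List<Integer> spilled2 = Arrays.asList(2, 3, 10);
    // The final chunk is merged straight from memory, never written out.
    List<Integer> lastInMemory = Arrays.asList(0, 5);

    Iterator<Integer> merged =
        new LazyChunkMerge(Arrays.asList(spilled1, spilled2, lastInMemory));
    while (merged.hasNext()) {
      System.out.println(merged.next());
    }
  }
}
```

Skipping the write of the final run saves one round trip to disk, and merging inside next means no fully merged copy of the data has to be materialised before the first row is returned.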



        githubbot ASF GitHub Bot added a comment -

        Github user jinossy commented on the pull request:

        https://github.com/apache/tajo/pull/869#issuecomment-159139295

        This PR is ready for review.
        Thanks.

        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/869#discussion_r45935202

        — Diff: tajo-pullserver/src/main/java/org/apache/tajo/pullserver/TajoPullServerService.java —
        @@ -403,12 +400,20 @@ public void decrementRemainFiles(FileRegion filePart, long fileStartTime)

        { minTime = fileSendTime; }

        + if (fileSendTime > 20 * 1000) {
        + LOG.warn("PullServer send too long time: filePos:" + filePart.position()
        — End diff –

        I know you just moved this code, but the original log message needs to be corrected. Could you update it?

        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/869#discussion_r45936295

        — Diff: tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/ExternalSortExec.java —
        @@ -103,7 +117,7 @@
        /** the final result */
        — End diff –

        The member variable ``memoryResident`` no longer seems to be necessary.

        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/869#issuecomment-159773488

        The patch looks great to me. Here is my +1. I left some trivial comments. You can commit the patch once you have addressed them.

        githubbot ASF GitHub Bot added a comment -

        Github user jinossy commented on the pull request:

        https://github.com/apache/tajo/pull/869#issuecomment-159787856

        I've updated the patch to reflect your comments, and I will commit it soon.
        Thanks for your review!

        githubbot ASF GitHub Bot added a comment -

        Github user asfgit closed the pull request at:

        https://github.com/apache/tajo/pull/869

        jhkim Jinho Kim added a comment -

        Committed it.
        Thanks.

        hudson Hudson added a comment -

        FAILURE: Integrated in Tajo-master-CODEGEN-build #603 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/603/)
        TAJO-1983: Improve memory usage of ExternalSortExec. (jhkim: rev 550c0189b9ac06f20d4ef70feb7ef743c0b06d0f)

        • tajo-core/src/main/java/org/apache/tajo/worker/ExecutionBlockContext.java
        • tajo-storage/tajo-storage-common/src/main/java/org/apache/tajo/storage/NullScanner.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/ResizableMemoryBlock.java
        • tajo-storage/tajo-storage-common/src/main/java/org/apache/tajo/storage/BaseTupleComparator.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/PhysicalExec.java
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/planner/physical/TestProgressExternalSortExec.java
        • tajo-core-tests/src/test/java/org/apache/tajo/querymaster/TestTaskStatusUpdate.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/UnSafeTupleList.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/OffHeapRowBlockUtils.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/OffHeapRowBlockWriter.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/OffHeapRowWriter.java
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/planner/physical/TestUnSafeTuple.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/BaseTupleBuilder.java
        • tajo-common/src/test/java/org/apache/tajo/tuple/memory/TestMemoryRowBlock.java
        • tajo-pullserver/src/main/java/org/apache/tajo/pullserver/TajoPullServerService.java
        • tajo-core/src/main/java/org/apache/tajo/worker/TaskImpl.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/CompactRowBlockWriter.java
        • tajo-core/src/main/java/org/apache/tajo/worker/Fetcher.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/RowWriter.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/ExternalSortExec.java
        • CHANGES
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-master-build #990 (See https://builds.apache.org/job/Tajo-master-build/990/)
        TAJO-1983: Improve memory usage of ExternalSortExec. (jhkim: rev 550c0189b9ac06f20d4ef70feb7ef743c0b06d0f)

        • tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/ExternalSortExec.java
        • tajo-storage/tajo-storage-common/src/main/java/org/apache/tajo/storage/BaseTupleComparator.java
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/planner/physical/TestProgressExternalSortExec.java
        • tajo-core/src/main/java/org/apache/tajo/worker/Fetcher.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/OffHeapRowWriter.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/OffHeapRowBlockWriter.java
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/planner/physical/TestUnSafeTuple.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/UnSafeTupleList.java
        • CHANGES
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/ResizableMemoryBlock.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/OffHeapRowBlockUtils.java
        • tajo-common/src/test/java/org/apache/tajo/tuple/memory/TestMemoryRowBlock.java
        • tajo-core-tests/src/test/java/org/apache/tajo/querymaster/TestTaskStatusUpdate.java
        • tajo-storage/tajo-storage-common/src/main/java/org/apache/tajo/storage/NullScanner.java
        • tajo-pullserver/src/main/java/org/apache/tajo/pullserver/TajoPullServerService.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/PhysicalExec.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/BaseTupleBuilder.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/RowWriter.java
        • tajo-core/src/main/java/org/apache/tajo/worker/ExecutionBlockContext.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/CompactRowBlockWriter.java
        • tajo-core/src/main/java/org/apache/tajo/worker/TaskImpl.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-0.11.1-build #114 (See https://builds.apache.org/job/Tajo-0.11.1-build/114/)
        TAJO-1983: Improve memory usage of ExternalSortExec. (jhkim: rev a28db5a70e3d694ad6f81da87ed6336aaa6ea6a0)

        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/ResizableMemoryBlock.java
        • tajo-storage/tajo-storage-common/src/main/java/org/apache/tajo/storage/NullScanner.java
        • tajo-core/src/main/java/org/apache/tajo/worker/ExecutionBlockContext.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/OffHeapRowWriter.java
        • tajo-core/src/main/java/org/apache/tajo/worker/TaskImpl.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/OffHeapRowBlockUtils.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/OffHeapRowBlockWriter.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/UnSafeTupleList.java
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/planner/physical/TestUnSafeTuple.java
        • CHANGES
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/RowWriter.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/PhysicalExec.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/BaseTupleBuilder.java
        • tajo-common/src/test/java/org/apache/tajo/tuple/memory/TestMemoryRowBlock.java
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/planner/physical/TestProgressExternalSortExec.java
        • tajo-core-tests/src/test/java/org/apache/tajo/querymaster/TestTaskStatusUpdate.java
        • tajo-core/src/main/java/org/apache/tajo/worker/Fetcher.java
        • tajo-storage/tajo-storage-common/src/main/java/org/apache/tajo/storage/BaseTupleComparator.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/ExternalSortExec.java
        • tajo-common/src/main/java/org/apache/tajo/tuple/memory/CompactRowBlockWriter.java
        • tajo-pullserver/src/main/java/org/apache/tajo/pullserver/TajoPullServerService.java

          People

          • Assignee:
            jhkim Jinho Kim
            Reporter:
            jhkim Jinho Kim
          • Votes:
            0
            Watchers:
            3
