  1. Tajo
  2. TAJO-561

PhysicalPlanner::createBestSortPlan should consider input size of leaf tasks

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: Physical Operator
    • Labels: None

      Description

      Please take a look at the following code, which chooses a sort operator according to the input volume. It has two problems. First, the threshold is a hard-coded constant; it should be configurable. Second, estimateSizeRecursive does not obtain the input volume when the task is a leaf. We should fix both. In addition, I think estimateSizeRecursive should be given a more descriptive name: the current one is vague and does not follow our naming convention.

      public SortExec createBestSortPlan(TaskAttemptContext context, SortNode sortNode,
                                         PhysicalExec child) throws IOException {
        String[] outerLineage = PlannerUtil.getRelationLineage(sortNode.getChild());
        long estimatedSize = estimateSizeRecursive(context, outerLineage);
        // Hard-coded threshold: 2000 MiB, expressed in bytes.
        final long threshold = 1048576 * 2000;

        // If the relation size does not exceed the threshold,
        // the in-memory sort will be used.
        if (estimatedSize <= threshold) {
          return new MemSortExec(context, sortNode, child);
        } else {
          return new ExternalSortExec(context, sm, sortNode, child);
        }
      }
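
      The two fixes requested above could look roughly like the following. This is only a minimal sketch under stated assumptions, not Tajo code: the class SortPlanSketch, the configuration key, and the idea of summing fragment lengths for a leaf task are all hypothetical, and a plain Map stands in for the real configuration object.

```java
import java.util.Map;

// Hypothetical sketch; none of these names are Tajo APIs.
final class SortPlanSketch {
  // Assumed configuration key; the default is the 2000 MiB value hard-coded in the issue.
  static final String THRESHOLD_KEY = "sketch.sort.inmemory.threshold-bytes";
  static final long DEFAULT_THRESHOLD_BYTES = 2000L * 1024 * 1024;

  enum SortKind { IN_MEMORY, EXTERNAL }

  /** Reads the threshold from a config map, falling back to the default. */
  static long thresholdBytes(Map<String, String> conf) {
    String raw = conf.get(THRESHOLD_KEY);
    return raw == null ? DEFAULT_THRESHOLD_BYTES : Long.parseLong(raw.trim());
  }

  /**
   * Estimates the input volume in bytes. For a leaf task (assumption), the
   * lengths of the assigned fragments are summed; otherwise the table-level
   * statistic is used.
   */
  static long estimateInputVolume(boolean isLeaf, long[] fragmentLengths, long tableStatBytes) {
    if (!isLeaf) {
      return tableStatBytes;
    }
    long total = 0;
    for (long len : fragmentLengths) {
      total += len;
    }
    return total;
  }

  /** Same comparison as in createBestSortPlan, with the threshold passed in. */
  static SortKind chooseSort(long estimatedBytes, long thresholdBytes) {
    return estimatedBytes <= thresholdBytes ? SortKind.IN_MEMORY : SortKind.EXTERNAL;
  }
}
```

      With this shape, the planner would call thresholdBytes once per query and pass the result to chooseSort, so the cut-off can be tuned without recompiling.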
      

        Activity

        Hyunsik Choi added a comment:

        After TAJO-36 and TAJO-584, ExternalSortExec is always the best executor for sorting, because it uses an in-memory algorithm until the input tuples exceed the available memory.
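
        The behavior described in this comment can be illustrated with a small sketch. It is not ExternalSortExec itself: SpillingSorterSketch is a hypothetical class, the memory limit is modeled as a tuple count, and in-memory lists stand in for the sorted runs a real external sort would write to disk.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch of a sorter that stays in memory until its buffer is full.
final class SpillingSorterSketch {
  private final int maxBuffered;                               // assumed memory limit, in tuples
  private final List<Integer> buffer = new ArrayList<>();
  private final List<List<Integer>> runs = new ArrayList<>();  // stand-in for spill files

  SpillingSorterSketch(int maxBuffered) {
    if (maxBuffered < 1) {
      throw new IllegalArgumentException("maxBuffered must be at least 1");
    }
    this.maxBuffered = maxBuffered;
  }

  void add(int value) {
    // Spill only when the next tuple would not fit in the buffer.
    if (buffer.size() == maxBuffered) {
      spill();
    }
    buffer.add(value);
  }

  private void spill() {
    Collections.sort(buffer);
    runs.add(new ArrayList<>(buffer));
    buffer.clear();
  }

  int spilledRuns() {
    return runs.size();
  }

  List<Integer> sorted() {
    Collections.sort(buffer);
    if (runs.isEmpty()) {
      return new ArrayList<>(buffer);  // input fit in memory: plain in-memory sort
    }
    List<List<Integer>> all = new ArrayList<>(runs);
    all.add(buffer);
    return merge(all);
  }

  // K-way merge of sorted runs; heap entries are {value, runIndex, positionInRun}.
  private static List<Integer> merge(List<List<Integer>> sortedRuns) {
    PriorityQueue<int[]> heap = new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
    for (int r = 0; r < sortedRuns.size(); r++) {
      if (!sortedRuns.get(r).isEmpty()) {
        heap.add(new int[] {sortedRuns.get(r).get(0), r, 0});
      }
    }
    List<Integer> out = new ArrayList<>();
    while (!heap.isEmpty()) {
      int[] top = heap.poll();
      out.add(top[0]);
      List<Integer> run = sortedRuns.get(top[1]);
      int next = top[2] + 1;
      if (next < run.size()) {
        heap.add(new int[] {run.get(next), top[1], next});
      }
    }
    return out;
  }
}
```

        When the input fits in the buffer, no run is spilled and the result comes from a single in-memory sort, which is why a separate size-based choice between two executors becomes unnecessary.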


          People

          • Assignee: Unassigned
          • Reporter: Hyunsik Choi