Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.0
    • Component/s: None
    • Labels: None

      Description

      Build a merging receiver operator which combines a number of incoming batches into a single outgoing batch.

      Overview

      Each incoming batch is already sorted when it reaches this operator, so the operator builds a priority queue holding the current value from each batch and uses it to determine which batch contains the next record. When the next value is removed from the priority queue, the corresponding record is copied from that batch's ValueVectors to the outgoing VectorContainer.

      The comparator for the priority queue is generated based on the supplied LogicalExpression (e.g. a single ValueVectorReadExpression).
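
The merge described in the Overview can be sketched as a k-way merge over presorted inputs. This is a minimal illustration, not the code in the patch: `MergingReceiverSketch`, `Cursor`, and `merge` are hypothetical names, plain Java lists stand in for ValueVectors and the outgoing VectorContainer, and a caller-supplied `Comparator` stands in for the comparator generated from the LogicalExpression.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical stand-in for the merging receiver; these are not Drill's classes. */
final class MergingReceiverSketch {

    /** Queue entry: points at one record inside one presorted incoming batch. */
    private static final class Cursor<T> {
        final List<T> batch;  // stands in for an incoming batch's ValueVectors
        int index;            // position of the next unread record in the batch

        Cursor(List<T> batch) { this.batch = batch; }

        T current() { return batch.get(index); }
    }

    /** Merges presorted batches into one sorted output, ordered by the comparator. */
    static <T> List<T> merge(List<List<T>> batches, Comparator<? super T> comparator) {
        // Order cursors by the record each one currently points at.
        PriorityQueue<Cursor<T>> queue = new PriorityQueue<>(
                Math.max(1, batches.size()),
                (a, b) -> comparator.compare(a.current(), b.current()));

        for (List<T> batch : batches) {
            if (!batch.isEmpty()) {
                queue.add(new Cursor<>(batch));
            }
        }

        List<T> outgoing = new ArrayList<>();  // stands in for the outgoing VectorContainer
        while (!queue.isEmpty()) {
            Cursor<T> next = queue.poll();     // batch that holds the next record
            outgoing.add(next.current());      // copy that record to the output
            next.index++;
            if (next.index < next.batch.size()) {
                queue.add(next);               // re-queue with the batch's following record
            }
        }
        return outgoing;
    }

    public static void main(String[] args) {
        List<List<Integer>> batches = Arrays.asList(
                Arrays.asList(1, 4, 7),
                Arrays.asList(2, 5),
                Arrays.asList(3, 6, 8));
        // prints [1, 2, 3, 4, 5, 6, 7, 8]
        System.out.println(merge(batches, Comparator.<Integer>naturalOrder()));
    }
}
```

The queue never holds more than one entry per incoming batch, so each output record costs one poll and at most one insert regardless of batch length.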

      Example

      The following example illustrates a distributed count operation, where each remote fragment counts a subset of the data and the root fragment produces a sum of each count aggregate.

      Data Flow

      RecordReader
        |
        +-> Sort
             |
             +-> StreamingAggregate(COUNT)
                    |
                    +-> MergingPartitionExchange
                           |
                           +-> StreamingAggregate(SUM)
                                  |
                                  +-> UnionExchange
                                         |
                                         +-> Screen
      

      Control Flow

      Root Fragment
      -------------
      Screen
         |
         +->UnionExchange
               | | |
               | | +->AggSum
               | |      |
               | |      +->MergingReceiver
               | |
               | +--->AggSum
               |        |
               |        +->MergingReceiver
               |
               +----->AggSum
                        |
                        +->MergingReceiver
                   ...
      
      Remote Fragment
      ---------------
      PartitionSender
             |
             +->AggCount
                   |
                   +->Sort
                       |
                       +->Reader
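
The two-phase aggregation in the diagrams above can be sketched as follows. This is an illustration under stated assumptions, not the operators from the patch: `DistributedCountSketch`, `countFragment`, and `sumPartials` are hypothetical names, the count is assumed to be grouped by a string key, and the root simply concatenates and sorts the partial results where the real plan would merge the presorted streams through the merging receiver.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of count-per-fragment followed by sum-at-root. */
final class DistributedCountSketch {

    /** Remote fragment: Sort, then a streaming COUNT over one subset of the data. */
    static List<Map.Entry<String, Long>> countFragment(List<String> subset) {
        List<String> sorted = new ArrayList<>(subset);
        Collections.sort(sorted);
        List<Map.Entry<String, Long>> counts = new ArrayList<>();
        for (String key : sorted) {
            int last = counts.size() - 1;
            if (last >= 0 && counts.get(last).getKey().equals(key)) {
                // Same key as the previous record: extend the running count.
                counts.get(last).setValue(counts.get(last).getValue() + 1);
            } else {
                counts.add(new AbstractMap.SimpleEntry<>(key, 1L));
            }
        }
        return counts;
    }

    /** Root fragment: bring partial counts into key order, then a streaming SUM per key. */
    static List<Map.Entry<String, Long>> sumPartials(List<List<Map.Entry<String, Long>>> partials) {
        // Assumption for brevity: concatenate and sort instead of a priority-queue merge.
        List<Map.Entry<String, Long>> merged = new ArrayList<>();
        for (List<Map.Entry<String, Long>> partial : partials) {
            merged.addAll(partial);
        }
        merged.sort(Map.Entry.comparingByKey());

        List<Map.Entry<String, Long>> totals = new ArrayList<>();
        for (Map.Entry<String, Long> entry : merged) {
            int last = totals.size() - 1;
            if (last >= 0 && totals.get(last).getKey().equals(entry.getKey())) {
                Map.Entry<String, Long> total = totals.get(last);
                total.setValue(total.getValue() + entry.getValue());
            } else {
                totals.add(new AbstractMap.SimpleEntry<>(entry.getKey(), entry.getValue()));
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> first = countFragment(Arrays.asList("b", "a", "b"));
        List<Map.Entry<String, Long>> second = countFragment(Arrays.asList("a", "c"));
        // prints [a=2, b=2, c=1]
        System.out.println(sumPartials(Arrays.asList(first, second)));
    }
}
```

Both aggregation steps are streaming: they only compare each record with the previous one, which is why the input to each must arrive in key order.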
      

        Activity

        Jacques Nadeau created issue -
        Ben Becker made changes -
        Field | Original Value | New Value
        Assignee | (empty) | Ben Becker [ benjamin.becker ]
        Ben Becker made changes -
        Description
        Original Value: Build a merging receiver operator which combines a number of incoming buffers into a single output stream by merging the streams based on equality of one or more expressions
        New Value: Build a merging receiver operator which combines a number of incoming buffers into a single output stream by merging the streams based on equality of one or more expressions.

        The following example illustrates a distributed count operation, where each remote fragment counts a subset of the data and the root fragment produces a sum of each count aggregate.

        h4. Data Flow
        {noformat}
        RecordReader
          |
          +-> Sort
               |
               +-> StreamingAggregate(COUNT)
                      |
                      +-> MergingPartitionExchange
                             |
                             +-> StreamingAggregate(SUM)
                                    |
                                    +-> UnionExchange
                                           |
                                           +-> Screen
        {noformat}

        h4. Control Flow

        {noformat}
        Root Fragment
        -------------
        Screen
           |
           +->UnionExchange
                 | | |
                 | | +->AggSum
                 | |      |
                 | |      +->MergingReceiver
                 | |
                 | +--->AggSum
                 |        |
                 |        +->MergingReceiver
                 |
                 +----->AggSum
                          |
                          +->MergingReceiver
                     ...

        Remote Fragment
        ---------------
        PartitionSender
               |
               +->AggCount
                     |
                     +->Sort
                         |
                         +->Reader
        {noformat}
        Jacques Nadeau added a comment -

        Nice! Looks right on the money.

        Ben Becker made changes -
        Description | earlier draft (the New Value of the preceding change) | current text, as shown under Description at the top of this issue
        Ben Becker made changes -
        Attachment DRILL-229.patch [ 12610400 ]
        Ben Becker added a comment (edited) -

        Reviewboard: https://reviews.apache.org/r/14958/
        Github: https://github.com/vrtx/incubator-drill/tree/DRILL-229-rebased
        NOTE: this patch was rebased on top of DRILL-230.
        Jacques Nadeau added a comment -

        merged in dd39a5b

        Jacques Nadeau made changes -
        Status | Open [ 1 ] | Resolved [ 5 ]
        Resolution | (empty) | Fixed [ 1 ]
        Jake Farrell made changes -
        Workflow | no-reopen-closed, patch-avail [ 12815197 ] | no-reopen-closed, patch-avail, testing [ 12860520 ]
        Jacques Nadeau made changes -
        Fix Version/s | (empty) | 0.4.0 [ 12324963 ]
        Tony Stevenson made changes -
        Workflow | no-reopen-closed, patch-avail, testing [ 12860520 ] | Drill workflow [ 12934232 ]
        Transition | Time In Source Status | Execution Times | Last Executer | Last Execution Date
        Open -> Resolved | 68d 5h 12m | 1 | Jacques Nadeau | 18/Nov/13 22:57

          People

          • Assignee: Ben Becker
          • Reporter: Jacques Nadeau
          • Votes: 0
          • Watchers: 2
