  1. Apache Tez
  2. TEZ-661

Implement a non-sorted partitioned output

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.5.0
    • Component/s: None
    • Labels: None

    Description

      When implementing Pig union, we need to gather data from two or more upstream vertices without sorting. The vertex itself might consist of several tasks. Ideally, it should use OnFileUnorderedKVOutput with DataMovementType.SCATTER_GATHER. However, according to Hitesh, this combination does not work, so we need to implement it. Also, the key is meaningless in this scenario; we just want to distribute the output records evenly across tasks.
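
The even, key-agnostic distribution requested above could be sketched as a simple round-robin partitioner. This is only an illustration of the idea, not the code in the attached patches and not part of the Tez API: the class name RoundRobinPartitioner and its methods are hypothetical, and a real output would write records to per-partition buffers or files rather than in-memory lists.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: assigns records to partitions in turn, ignoring keys. */
final class RoundRobinPartitioner {
    private final int numPartitions;
    private int next = 0;

    RoundRobinPartitioner(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        this.numPartitions = numPartitions;
    }

    /** Returns the partition index for the next record; the key is never consulted. */
    int nextPartition() {
        int partition = next;
        next = (next + 1) % numPartitions;
        return partition;
    }

    /** Spreads records over partitions in arrival order, without any sorting. */
    static <T> List<List<T>> distribute(List<T> records, int numPartitions) {
        RoundRobinPartitioner partitioner = new RoundRobinPartitioner(numPartitions);
        List<List<T>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (T record : records) {
            partitions.get(partitioner.nextPartition()).add(record);
        }
        return partitions;
    }
}
```

With this scheme, partition sizes differ by at most one record, and the order of records within each partition matches their arrival order.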

      Attachments

        1. TEZ-661.1.txt
          76 kB
          Siddharth Seth
        2. TEZ-661.2.txt
          78 kB
          Siddharth Seth

            People

              Assignee: Siddharth Seth (sseth)
              Reporter: Daniel Dai (daijy)
              Votes: 0
              Watchers: 8