  1. Pig
  2. PIG-3839 Umbrella jira for Pig on Tez Performance Improvements
  3. PIG-4785

Optimize multi-query plan for diamond shape edges


Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 0.18.0
    • Component/s: tez
    • Labels: None

    Description

      When a plan contains a diamond-shaped edge (two edges going to the same vertex), we do not merge the operator into the Split, and a lot of data is transferred as a result. This can be optimized by merging the operator into the Split while still keeping a POValueInputTez->POValueOutputTez vertex whose only job is to redirect the input, which avoids the diamond-shaped edge. Filtering and other processing can then happen in the Split operator itself, so the data transferred to the routing vertex will be minimal.
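
      To illustrate the rerouting idea, here is a minimal, self-contained Java sketch. It is not Pig or Tez code: `DiamondEdgeSketch`, `Edge`, `rerouteDuplicateEdges`, and the `router_N` vertex names are hypothetical, and the plan is modeled as a plain list of edges between named vertices. The sketch assumes the problem case is a second direct edge between the same pair of vertices, and it sends each such edge through its own pass-through vertex, which stands in for the POValueInputTez->POValueOutputTez vertex.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class DiamondEdgeSketch {

    /** Directed edge between two named vertices (hypothetical model, not a Pig or Tez class). */
    static final class Edge {
        final String from;
        final String to;

        Edge(String from, String to) {
            this.from = from;
            this.to = to;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Edge)) {
                return false;
            }
            Edge other = (Edge) o;
            return from.equals(other.from) && to.equals(other.to);
        }

        @Override
        public int hashCode() {
            return Objects.hash(from, to);
        }
    }

    /**
     * Returns a new edge list in which every repeated (from, to) edge is
     * redirected through its own routing vertex, so that no pair of vertices
     * is connected by more than one direct edge.
     * Assumes no existing vertex is already named "router_N".
     */
    static List<Edge> rerouteDuplicateEdges(List<Edge> plan) {
        Set<Edge> seen = new HashSet<>();
        List<Edge> result = new ArrayList<>();
        int routerCount = 0;
        for (Edge e : plan) {
            if (seen.add(e)) {
                // First edge between this pair stays direct.
                result.add(e);
            } else {
                // Repeated edge: insert a pass-through vertex in the middle.
                String router = "router_" + (++routerCount);
                result.add(new Edge(e.from, router));
                result.add(new Edge(router, e.to));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The Split feeds the same downstream vertex twice.
        List<Edge> plan = List.of(
                new Edge("load", "split"),
                new Edge("split", "join"),
                new Edge("split", "join"));
        for (Edge e : rerouteDuplicateEdges(plan)) {
            System.out.println(e.from + " -> " + e.to);
        }
    }
}
```

      In this toy plan the second split -> join edge becomes split -> router_1 -> join. The point of the proposal is that the filtering would already have run inside the Split, so only the reduced data would cross the extra hop.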


          People

            Assignee: Rohini Palaniswamy
            Reporter: Rohini Palaniswamy

            Dates

              Created:
              Updated:
