Pig / PIG-981

Merge join should restrict join key expressions to simple projects

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.4.0

    Description

      Currently, merge join allows join keys to be arbitrary expressions, on the assumption that the expressions preserve the sort order of the input. Since only ascending sort order is currently supported, the code checks the sort order at run time and catches the case where it is broken because a join key expression is not order preserving. However, there is a reason to restrict join keys to projections of columns only:
      PIG-953 will enable merge join to work with loaders and store functions that can internally index sorted data. These store functions can only create an index (and hence perform lookups on the index) on raw data columns, not on expressions over those columns.
      Hopefully this does not reduce the usability of merge join much: the expressions can always be applied after the join to the join columns, and since the expressions are order preserving, they do not affect the outcome of the join.
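
      To illustrate the argument above, here is a minimal sketch in Python (not Pig's actual implementation). The names merge_join, check_sorted, LEFT, RIGHT, and expr are hypothetical, and the expression is assumed to be strictly increasing. It shows a run-time ascending-order check on the join key, and that joining on the raw column and applying the expression after the join matches the same rows as joining on the expression itself.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Row = Tuple  # a row is a plain tuple; column 0 is the join column here


def check_sorted(rows: Iterable[Row], key: Callable[[Row], int]) -> Iterator[Row]:
    """Yield rows unchanged, raising if key(row) ever decreases."""
    prev = None
    for row in rows:
        k = key(row)
        if prev is not None and k < prev:
            raise ValueError(f"sort order broken: {k!r} after {prev!r}")
        prev = k
        yield row


def merge_join(left: Iterable[Row], right: Iterable[Row],
               left_key: Callable[[Row], int],
               right_key: Callable[[Row], int]) -> List[Row]:
    """Inner sort-merge join of two inputs that are ascending on their keys."""
    lrows = list(check_sorted(left, left_key))
    rrows = list(check_sorted(right, right_key))
    out: List[Row] = []
    i = j = 0
    while i < len(lrows) and j < len(rrows):
        lk, rk = left_key(lrows[i]), right_key(rrows[j])
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key, then pair it
            # with every left row that has the same key.
            j_end = j
            while j_end < len(rrows) and right_key(rrows[j_end]) == lk:
                j_end += 1
            while i < len(lrows) and left_key(lrows[i]) == lk:
                out.extend(lrows[i] + r for r in rrows[j:j_end])
                i += 1
            j = j_end
    return out


def col(row: Row) -> int:
    """Simple projection of the join column."""
    return row[0]


def expr(v: int) -> int:
    """Hypothetical join key expression, assumed strictly increasing."""
    return 2 * v + 1


LEFT = [(1, "a"), (2, "b"), (4, "c")]
RIGHT = [(2, "x"), (4, "y"), (5, "z")]

# Join on the expression (what merge join currently allows).
by_expr = merge_join(LEFT, RIGHT, lambda r: expr(r[0]), lambda r: expr(r[0]))

# Join on the raw column only (the proposed restriction).
by_col = merge_join(LEFT, RIGHT, col, col)

# The expression can still be computed after the join, on the join column.
post_join = [(expr(row[0]),) + row for row in by_col]
```

      With an expression that does not preserve order, such as lambda r: -r[0], check_sorted raises ValueError on these inputs, which mirrors the run-time check described above.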

            People

              Assignee: Pradeep Kamath (pkamath)
              Reporter: Pradeep Kamath (pkamath)
