Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: tez-branch
    • Component/s: tez
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Null keys must not compare as equal in a join, so the comparator needs to change to handle that. The e2e tests Join_9, Join_10, and Join_11 manifest this issue.
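
To illustrate the comparator-side idea in the description, here is a minimal sketch. It is not the code in the attached patches. `IndexedKey` and `GROUPING` are hypothetical names invented for this example, and it assumes each shuffle key carries the index of the join input that produced it. Non-null keys group by value as usual, while two null keys group together only when they come from the same input.

```java
import java.util.Comparator;

public class NullAwareJoinKeys {
    /** Hypothetical stand-in for a shuffle key: the join key plus the index of its input relation. */
    public static final class IndexedKey {
        final Object key;  // join key, may be null
        final int index;   // which join input produced the record (assumed to be set reliably)

        public IndexedKey(Object key, int index) {
            this.key = key;
            this.index = index;
        }
    }

    /** Grouping comparator: null keys are "equal" only within the same input. */
    public static final Comparator<IndexedKey> GROUPING = (a, b) -> {
        if (a.key == null && b.key == null) {
            // Break the tie on the input index so nulls from different inputs never meet.
            return Integer.compare(a.index, b.index);
        }
        if (a.key == null) return -1;  // nulls sort before non-null keys
        if (b.key == null) return 1;
        @SuppressWarnings("unchecked")
        Comparable<Object> left = (Comparable<Object>) a.key;
        return left.compareTo(b.key);
    };

    public static void main(String[] args) {
        IndexedKey leftNull = new IndexedKey(null, 0);
        IndexedKey rightNull = new IndexedKey(null, 1);
        IndexedKey leftX = new IndexedKey("x", 0);
        IndexedKey rightX = new IndexedKey("x", 1);

        System.out.println("x joins x: " + (GROUPING.compare(leftX, rightX) == 0));
        System.out.println("null joins null: " + (GROUPING.compare(leftNull, rightNull) == 0));
    }
}
```

This approach depends on the index being distinct for each input relation, which is the reliability question raised in the comments below.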

      1. PIG-3761-1.patch
        11 kB
        Daniel Dai
      2. PIG-3761-2.patch
        2 kB
        Daniel Dai

        Activity

        rohini Rohini Palaniswamy added a comment -

        How does it work with MR?

        daijy Daniel Dai added a comment -

        In MR, we only use PigXXXRawComparator for sorting, not for cogroup/join. Those use PigWritableComparator, which has the same null comparison logic.

        mwagner Mark Wagner added a comment -

        What are your thoughts on handling this as part of the JoinPackager instead of in the Comparator? It seems like that might make the fact that we're handling null specially more explicit, which could help maintainability in the future.

        Are there any cases where we'll have the same index from two different relations? We haven't really been using the indices explicitly in the Tez branch, so I'm not sure how reliable they are.

        daijy Daniel Dai added a comment -

        We do set index correctly in tez. But what you suggest should work. Let me try.

        daijy Daniel Dai added a comment -

        Changed JoinPackager handling to deal with the issue.
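
For readers following the thread, a minimal sketch of what packager-side null handling could look like. It is not the committed patch and does not use Pig's actual JoinPackager API. `regroup` and `innerJoin` are hypothetical helpers, records are plain strings, and the join is assumed to have exactly two inputs. When the key is null, the group is split so that each output group holds records from one input only, and an inner join over such a group produces no rows.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class NullKeyJoinPackagerSketch {
    /**
     * bags.get(i) holds the records from input i that arrived under one key.
     * Returns the groups to hand to the join step.
     */
    public static List<List<List<String>>> regroup(Object key, List<List<String>> bags) {
        List<List<List<String>>> groups = new ArrayList<>();
        if (key != null) {
            groups.add(bags);  // ordinary key: keep all inputs together
            return groups;
        }
        // Null key: emit one group per non-empty input, with every other bag left empty.
        for (int i = 0; i < bags.size(); i++) {
            if (bags.get(i).isEmpty()) {
                continue;
            }
            List<List<String>> isolated = new ArrayList<>();
            for (int j = 0; j < bags.size(); j++) {
                isolated.add(j == i ? bags.get(i) : Collections.<String>emptyList());
            }
            groups.add(isolated);
        }
        return groups;
    }

    /** Two-input inner join of one group: cross product of the bags, empty if either bag is empty. */
    public static List<String> innerJoin(List<List<String>> group) {
        List<String> rows = new ArrayList<>();
        for (String left : group.get(0)) {
            for (String right : group.get(1)) {
                rows.add(left + "|" + right);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<List<String>> bags = new ArrayList<>();
        bags.add(Collections.singletonList("a1"));
        bags.add(Collections.singletonList("b1"));

        for (List<List<String>> group : regroup("k", bags)) {
            System.out.println("key k: " + innerJoin(group));
        }
        for (List<List<String>> group : regroup(null, bags)) {
            System.out.println("null key: " + innerJoin(group));
        }
    }
}
```

Keeping the special case in one place makes the null handling explicit, which is the maintainability argument made above for handling it outside the comparator.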

        cheolsoo Cheolsoo Park added a comment -

        +1. LGTM.

        daijy Daniel Dai added a comment -

        Patch committed to tez. Thanks Cheolsoo for review!


          People

          • Assignee:
            daijy Daniel Dai
            Reporter:
            daijy Daniel Dai
          • Votes:
            0
            Watchers:
            4
