Pig / PIG-4587

Applying isFirstReduceOfKey for Skewed left outer join skips records

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.15.0
    • Fix Version/s: 0.16.0, 0.15.1
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      PIG-4377 introduced isFirstReduceOfKey to avoid extra records in case of oversampling. That problem can occur only with a right outer join, but the check is added to the plan in MRCompiler and TezCompiler (PIG-4580) for both left and right outer joins. We need to remove the extra check for left outer join, where it is an unnecessary performance penalty.
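
The reasoning can be illustrated with a toy model. This is only a sketch, not Pig's implementation: it assumes that rows of the skewed (left) input are dealt round-robin across the reducers assigned to a key, that rows of the other (right) input are copied to each of those reducers, and that the check keeps null-padded rows only on the first reducer of a key. The function name `simulate_skewed_outer_join` and all of its parameters are hypothetical.

```python
from collections import defaultdict


def simulate_skewed_outer_join(left_rows, right_rows, reducers_per_key,
                               kind, first_reducer_check):
    """Toy model of a skewed outer join (NOT Pig's actual code).

    left_rows / right_rows: lists of (key, value) pairs.
    reducers_per_key: how many reducers the sampler assigned to each key.
    kind: "left" or "right" outer join.
    first_reducer_check: if True, null-padded rows are emitted only on
        reducer 0 of a key (assumed behaviour of the check).
    """
    # Assumption: left (skewed) rows are dealt round-robin over the
    # reducers assigned to their key.
    left_at = defaultdict(list)
    dealt = defaultdict(int)
    for key, value in left_rows:
        reducer = dealt[key] % reducers_per_key.get(key, 1)
        dealt[key] += 1
        left_at[(key, reducer)].append(value)

    # Assumption: right rows are copied to every reducer of their key.
    right_at = defaultdict(list)
    for key, value in right_rows:
        for reducer in range(reducers_per_key.get(key, 1)):
            right_at[(key, reducer)].append(value)

    output = []
    for key, reducer in sorted(set(left_at) | set(right_at)):
        lefts = left_at.get((key, reducer), [])
        rights = right_at.get((key, reducer), [])
        if lefts and rights:
            # Matching rows on this reducer: ordinary inner-join output.
            output.extend((key, l, r) for l in lefts for r in rights)
            continue
        # One side is empty on this reducer: null-pad the outer side.
        if kind == "left":
            padded = [(key, l, None) for l in lefts]
        else:
            padded = [(key, None, r) for r in rights]
        if first_reducer_check and reducer != 0:
            padded = []  # the check suppresses padding on later reducers
        output.extend(padded)
    return output


# Oversampled key: 3 reducers assigned, but only one left row exists.
OVERSAMPLED = dict(left_rows=[("k", "l1")], right_rows=[("k", "r1")],
                   reducers_per_key={"k": 3})
# Left rows with no match on the right, spread over 3 reducers.
UNMATCHED_LEFT = dict(left_rows=[("k", "l1"), ("k", "l2"), ("k", "l3")],
                      right_rows=[], reducers_per_key={"k": 3})
```

Under these assumptions, a right outer join on `OVERSAMPLED` yields three rows without the check (one match plus two spurious null-padded copies) and the correct single row with it. A left outer join on `UNMATCHED_LEFT` yields the correct three rows without the check, but only one row with it, which is the skipped-records symptom named in the summary.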

        Attachments

        1. PIG-4587-1.patch
          15 kB
          Rohini Palaniswamy
        2. PIG-4587-1-branch-0.15.patch
          10 kB
          Rohini Palaniswamy

              People

              • Assignee:
                Jianyong Dai (daijy)
              • Reporter:
                Rohini Palaniswamy (rohini)
              • Votes:
                0
              • Watchers:
                2

                Dates

                • Created:
                  Updated:
                  Resolved: