Pig / PIG-4806

UDFContext can be reset in the middle during Tez input and output initialization

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.16.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      PigProcessor.initialize() reinitializes the UDFContext ThreadLocal itself. Because PigProcessor.initialize() runs in parallel with the threads that run MRInput.initialize() and MROutput.initialize(), it can overwrite values that the input and output have already initialized, partway through their initialization. If the property stored in UDFContext is mandatory, this causes an exception and the task is retried, which is acceptable. If the property is not mandatory and is read back as null when it actually had a value, wrong data can be produced silently.
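
      The interleaving described above can be illustrated with a minimal, self-contained sketch. It is not Pig or Tez code: the Context class, the holder field, reset(), and the "schema" property are all hypothetical stand-ins chosen for illustration. It only models the general hazard of replacing a static ThreadLocal while another thread is partway through using it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;

public class ContextResetSketch {
    // Hypothetical stand-in for a per-thread context that stores properties.
    static final class Context {
        final Map<String, String> props = new HashMap<>();
    }

    // The holder itself is a replaceable static field (assumption for this sketch).
    static volatile ThreadLocal<Context> holder = ThreadLocal.withInitial(Context::new);

    // Models a reset that swaps the ThreadLocal object instead of clearing one thread's value.
    static void reset() {
        holder = ThreadLocal.withInitial(Context::new);
    }

    // Forces the problematic ordering: store, then reset, then read back.
    static String runInterleaving() {
        CountDownLatch stored = new CountDownLatch(1);
        CountDownLatch resetDone = new CountDownLatch(1);
        String[] seen = new String[1];

        // Plays the role of an input/output initialization thread.
        Thread inputInit = new Thread(() -> {
            holder.get().props.put("schema", "a:int");
            stored.countDown();
            try {
                resetDone.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            // The same thread now reads from a fresh, empty context.
            seen[0] = holder.get().props.get("schema");
        });

        try {
            inputInit.start();
            stored.await();
            reset(); // plays the role of the processor's initialization
            resetDone.countDown();
            inputInit.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("value after reset: " + runInterleaving());
    }
}
```

      In this sketch the initialization thread stores a value, the holder is swapped, and the later read returns null with no exception, which corresponds to the silent wrong-data case. Whether code that treats the property as mandatory would fail instead depends on how the reader handles the missing value.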

        Attachments

        1. PIG-4806-1.patch
          6 kB
          Rohini Palaniswamy

            People

            • Assignee: Rohini Palaniswamy
            • Reporter: Rohini Palaniswamy
            • Votes: 0
            • Watchers: 2
