Pig / PIG-1874

Make PigServer work in a multithreading environment

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.9.0
    • Component/s: impl
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The goal is for PigServer to work correctly when each thread creates its own PigServer instance. PigServer is not synchronized, so a single instance should not be shared between threads.

      1. PIG-1874.patch
        10 kB
        Richard Ding
      2. PIG-1874_1.patch
        15 kB
        Richard Ding

        Activity

        Richard Ding created issue -
        Santhosh Srinivasan added a comment -

        +1

        Richard Ding added a comment -

        Attaching patch for review.

        This patch removes the static variables from the PigServer and PigContext classes. It also makes the UDFContext instance thread-local.

        To avoid sharing a PigContext object, users should use one of the following constructors to create a PigServer instance in each thread:

        public PigServer(ExecType execType) throws ExecException;
        
        public PigServer(ExecType execType, Properties properties) throws ExecException;
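
        The thread-local idea mentioned in this comment can be illustrated with a minimal, standard-library-only sketch. It is not the code from the patch: `ThreadLocalContextSketch` and its nested `Context` class are hypothetical stand-ins for UDFContext, used only to show that a thread-local holder gives every thread its own instance.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a per-thread context; not Pig's actual UDFContext code. */
public class ThreadLocalContextSketch {

    /** Placeholder context object (assumption: the real class carries job state). */
    public static final class Context {
    }

    // Each thread that calls get() lazily receives its own Context instance.
    private static final ThreadLocal<Context> CURRENT =
            ThreadLocal.withInitial(Context::new);

    public static Context get() {
        return CURRENT.get();
    }

    /** Starts threadCount threads and returns how many distinct contexts they observed. */
    public static int distinctContexts(int threadCount) {
        // Context does not override equals/hashCode, so the set compares by identity.
        Set<Context> seen = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> seen.add(get()));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctContexts(4)); // prints 4
    }
}
```

        Within a single thread, repeated calls to `get()` return the same object, which is why state stored there is not visible to PigServer instances running in other threads.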
        
        Richard Ding made changes -
        Field Original Value New Value
        Attachment PIG-1874.patch [ 12473221 ]
        Alan Gates added a comment -

        Changes look good. What kind of testing are we doing to make sure we can have PigServers running in multiple threads with no clashes?

        Richard Ding added a comment -

        Attaching a patch that adds a unit test for UDFContext. There are also existing unit tests for parallel execution of bound scripts in embedded Pig.

        Test-patch output:

             [exec] -1 overall.  
             [exec] 
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec] 
             [exec]     +1 tests included.  The patch appears to include 6 new or modified tests.
             [exec] 
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec] 
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec] 
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec] 
             [exec]     -1 release audit.  The applied patch generated 541 release audit warnings (more than the trunk's current 540 warnings).
        

        The release audit warning is HTML related.

        Richard Ding made changes -
        Attachment PIG-1874_1.patch [ 12473425 ]
        Richard Ding added a comment -

        Patch committed to trunk.

        Richard Ding made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Hadoop Flags [Reviewed]
        Resolution Fixed [ 1 ]
        Vincent BARAT added a comment -

        Thanks, guys! You saved my life with this patch!

        Thomas Memenga added a comment -

        Be aware that the current implementation seems to have a memory leak if you reuse threads.

        I executed thousands of (very small) Pig jobs in parallel using a java.util.concurrent.ExecutorService (fixed-size thread pool)
        and ran into memory problems after 3-4 hours. (Statistics related?)

        My workaround: spawn a new thread for each PigServer and let the garbage collector do the cleanup.
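
        The workaround described here could look roughly like the sketch below. It is an illustration under assumptions, not code from this issue: `FreshThreadPerJob` and `runAll` are hypothetical names, the job body is a placeholder for creating a PigServer and running a script, and the semaphore is only one possible way to keep the number of concurrent jobs bounded without a reusable thread pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of the thread-per-job workaround; the job body is a placeholder. */
public class FreshThreadPerJob {

    /**
     * Runs every job on its own short-lived thread, at most maxParallel at a time.
     * Returns the number of jobs that finished without throwing.
     */
    public static int runAll(List<Runnable> jobs, int maxParallel) {
        Semaphore slots = new Semaphore(maxParallel);
        AtomicInteger completed = new AtomicInteger();
        try {
            for (Runnable job : jobs) {
                slots.acquire(); // wait for a free slot before starting another thread
                new Thread(() -> {
                    try {
                        job.run(); // real use (assumption): create a PigServer here and run the script
                        completed.incrementAndGet();
                    } finally {
                        slots.release(); // free the slot even if the job failed
                    }
                }).start();
            }
            // Taking every permit blocks until all running jobs have released theirs.
            slots.acquire(maxParallel);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        List<Runnable> jobs = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            jobs.add(() -> { });
        }
        System.out.println(runAll(jobs, 3)); // prints 10
    }
}
```

        Because no thread is reused, any per-thread state disappears when the thread ends. Whether that is what cures the memory growth reported above is not confirmed in this issue.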

        Olga Natkovich made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Transition            Time In Source Status   Execution Times   Last Executer    Last Execution Date
        Open -> Resolved      9d 22h 30m              1                 Richard Ding     11/Mar/11 20:09
        Resolved -> Closed    145d 4h 25m             1                 Olga Natkovich   04/Aug/11 01:34

          People

          • Assignee:
            Richard Ding
            Reporter:
            Richard Ding
          • Votes:
            1
            Watchers:
            4
