Pig / PIG-1891

Enable StoreFunc to make intelligent decision based on job success or failure

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.10.0
    • Fix Version/s: 0.11
    • Component/s: None
    • Labels:
    • Release Note:
      This adds a new method, cleanupOnSuccess, to the StoreFunc interface, and thus will cause backward compatibility issues for users who directly implement this interface. Most store functions implement StoreFuncImpl, which will shield them from this issue as it implements the new method.

      Description

      We are in the process of using Pig for various data processing and component integration tasks. Here is where we feel Pig storage funcs fall short:

      They are not aware of whether the overall job has succeeded. This creates a problem for storage funcs that need to "upload" results into another system: a DB, FTP, another file system, etc.

      I looked at the DBStorage in the piggybank (http://svn.apache.org/viewvc/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/DBStorage.java?view=markup), and what I see is essentially a mechanism that does the following for each task:

      1. Creates a record writer (in this case, opens a connection to the DB).
      2. Opens a transaction.
      3. Writes records into a batch.
      4. Executes a commit or a rollback depending on whether the task was successful.

      While this approach works great at the task level, it does not work at all at the job level.

      If some tasks succeed but the overall job fails, partial records will still be uploaded into the DB.
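
      The failure mode described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: `FakeTable`, `TaskWriter`, and `runJob` are hypothetical names invented for this example, the "database" is an in-memory list, and none of it is taken from DBStorage or Pig itself. It shows that committing per task leaves rows behind when a later task fails.

```java
import java.util.ArrayList;
import java.util.List;

class PerTaskCommitDemo {
    /** Hypothetical in-memory stand-in for a database table (not part of Pig). */
    static class FakeTable {
        final List<String> committed = new ArrayList<>();
    }

    /** Hypothetical per-task writer: batches records, then commits or rolls back. */
    static class TaskWriter {
        private final FakeTable table;
        private final List<String> batch = new ArrayList<>();

        TaskWriter(FakeTable table) {
            this.table = table;
        }

        void write(String record) {
            batch.add(record);
        }

        /** Commits the batch if the task succeeded; otherwise discards it. */
        void close(boolean taskSucceeded) {
            if (taskSucceeded) {
                table.committed.addAll(batch);
            }
            batch.clear();
        }
    }

    /** Runs two tasks; the second one fails, so the job as a whole fails. */
    static List<String> runJob() {
        FakeTable table = new FakeTable();

        TaskWriter first = new TaskWriter(table);
        first.write("a");
        first.write("b");
        first.close(true);   // task 1 succeeds and commits

        TaskWriter second = new TaskWriter(table);
        second.write("c");
        second.close(false); // task 2 fails and rolls back

        // Rows from task 1 are still present even though the job failed.
        return table.committed;
    }

    public static void main(String[] args) {
        System.out.println(runJob());
    }
}
```

      In this sketch the table ends up holding the rows from the first task only, which is the partial upload the description is concerned about.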

      Any ideas for a workaround?

      Our current workaround is fairly ugly: we created a Java wrapper that launches Pig jobs and then uploads to the DBs once the Pig job has succeeded. While the approach works, it is not really integrated into Pig.
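
      A job-level hook would let the store function do what the wrapper does today. The sketch below is a hedged illustration, not Pig's actual API: `JobAwareStore`, `StagingStore`, and `runJob` are hypothetical, the method signatures are simplified to take no arguments, and `cleanupOnFailure` is included as an assumed counterpart to the `cleanupOnSuccess` method named in the release note. Records are staged while tasks run and published only once the whole job is known to have succeeded.

```java
import java.util.ArrayList;
import java.util.List;

class JobLevelCommitDemo {
    /** Hypothetical hook interface modeled on the idea in this issue; not Pig's real interface. */
    interface JobAwareStore {
        void putNext(String record);

        void cleanupOnSuccess();

        void cleanupOnFailure();
    }

    /** Stages records during the job and publishes them only on job success. */
    static class StagingStore implements JobAwareStore {
        final List<String> staging = new ArrayList<>();
        final List<String> published = new ArrayList<>();

        @Override
        public void putNext(String record) {
            staging.add(record);
        }

        @Override
        public void cleanupOnSuccess() {
            // The whole job succeeded: move everything to the target in one step.
            published.addAll(staging);
            staging.clear();
        }

        @Override
        public void cleanupOnFailure() {
            // The job failed: drop staged records so nothing partial is published.
            staging.clear();
        }
    }

    /** Simulates a driver that calls exactly one hook after the job ends. */
    static List<String> runJob(boolean jobSucceeded) {
        StagingStore store = new StagingStore();
        store.putNext("a");
        store.putNext("b");
        if (jobSucceeded) {
            store.cleanupOnSuccess();
        } else {
            store.cleanupOnFailure();
        }
        return store.published;
    }

    public static void main(String[] args) {
        System.out.println(runJob(true));
        System.out.println(runJob(false));
    }
}
```

      The design choice here is to keep task-level writes away from the final destination, so the success hook is the only place where data becomes visible to the downstream system.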

      1. PIG-1891-3.patch
        10 kB
        Eli Reisman
      2. PIG-1891-2.patch
        9 kB
        Eli Reisman
      3. PIG-1891-1.patch
        9 kB
        Eli Reisman

        Issue Links

          Activity

          Alex Rovner created issue -
          Alan Gates made changes -
          Field Original Value New Value
          Priority Major [ 3 ] Minor [ 4 ]
          Eli Reisman made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Affects Version/s 0.10.0 [ 12316246 ]
          Labels patch
          Eli Reisman made changes -
          Attachment PIG-1891-1.patch [ 12537427 ]
          Eli Reisman made changes -
          Attachment PIG-1891-1.patch [ 12537427 ]
          Eli Reisman made changes -
          Attachment PIG-1891-1.patch [ 12537432 ]
          Eli Reisman made changes -
          Attachment PIG-1891-2.patch [ 12540989 ]
          Alan Gates made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Eli Reisman made changes -
          Attachment PIG-1891-3.patch [ 12543442 ]
          Alan Gates made changes -
          Assignee Eli Reisman [ initialcontext ]
          Alan Gates made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Release Note This adds a new method, cleanupOnSuccess, to the StoreFunc interface, and thus will cause backward compatibility issues for users who directly implement this interface. Most store functions implement StoreFuncImpl, which will shield them from this issue as it implements the new method.
          Fix Version/s 0.11 [ 12318878 ]
          Resolution Fixed [ 1 ]
          Alan Gates made changes -
          Link This issue is related to PIG-2935 [ PIG-2935 ]
          Bill Graham made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Nezih Yigitbasi made changes -
          Link This issue is related to PIG-3770 [ PIG-3770 ]

            People

            • Assignee:
              Eli Reisman
              Reporter:
              Alex Rovner
            • Votes:
              0
              Watchers:
              10
