Cassandra / CASSANDRA-2869

CassandraStorage does not function properly when used multiple times in a single pig script due to UDFContext sharing issues

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 0.7.9, 0.8.2
    • Component/s: None
    • Labels:
      None

      Description

      CassandraStorage appears to have threading issues along the lines of those described at http://pig.markmail.org/message/oz7oz2x2dwp66eoz due to the sharing of the UDFContext.

      I believe the fix lies in implementing

      public void setStoreFuncUDFContextSignature(String signature)
      {
      }
      

      and then using that signature when getting the UDFContext.

      From the Pig manual:

      setStoreFuncUDFContextSignature(): This method will be called by Pig both in the front end and back end to pass a unique signature to the Storer. The signature can be used to store into the UDFContext any information which the Storer needs to store between various method invocations in the front end and back end. The default implementation in StoreFunc has an empty body. This method will be called before other methods.
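      To illustrate why keying on the signature matters, here is a minimal, self-contained sketch. It does not use Pig itself: FakeUdfContext and SketchStorage are hypothetical stand-ins for Pig's UDFContext and for CassandraStorage, and the "schema" property name is an assumption chosen for the example. It only shows the idea that two storer instances in one script keep separate state once each looks up its properties by its own signature.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for Pig's UDFContext: one property store shared by the whole script.
final class FakeUdfContext {
    private static final Map<String, Properties> STORE = new HashMap<>();

    // Returns the Properties bucket for a UDF class plus a distinguishing key.
    static Properties getUdfProperties(Class<?> udfClass, String key) {
        return STORE.computeIfAbsent(udfClass.getName() + "|" + key, k -> new Properties());
    }
}

// Hypothetical storer; this is not the real CassandraStorage.
final class SketchStorage {
    // Stays empty unless the callback below is invoked.
    private String signature = "";

    // Same method name as the callback quoted in the ticket.
    public void setStoreFuncUDFContextSignature(String signature) {
        this.signature = signature;
    }

    // Saves state under this instance's signature.
    void rememberSchema(String schema) {
        FakeUdfContext.getUdfProperties(SketchStorage.class, signature)
                      .setProperty("schema", schema);
    }

    // Reads back the state saved under this instance's signature.
    String recallSchema() {
        return FakeUdfContext.getUdfProperties(SketchStorage.class, signature)
                             .getProperty("schema");
    }
}

final class SignatureDemo {
    public static void main(String[] args) {
        SketchStorage users = new SketchStorage();
        SketchStorage events = new SketchStorage();

        // Each use of the storer in the script gets its own signature.
        users.setStoreFuncUDFContextSignature("store-users");
        events.setStoreFuncUDFContextSignature("store-events");

        users.rememberSchema("users-schema");
        events.rememberSchema("events-schema");

        System.out.println(users.recallSchema());
        System.out.println(events.recallSchema());
    }
}
```

      In this sketch, if neither instance receives a signature, both fall back to the same empty key and the second rememberSchema call overwrites the first, which is the kind of collision the ticket describes.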

      1. 2869.txt
        4 kB
        Jeremy Hanna
      2. 2869-2.txt
        5 kB
        Jeremy Hanna

        Activity

        Transition                  | Time In Source Status | Execution Times | Last Executer    | Last Execution Date
        Open -> Patch Available     | 4d 10h 3m             | 1               | Jeremy Hanna     | 12/Jul/11 04:30
        Patch Available -> Resolved | 9d 15h 40m            | 1               | Brandon Williams | 21/Jul/11 20:11
        Aleksey Yeschenko made changes -
        Component/s Examples [ 12313082 ]
        Gavin made changes -
        Workflow patch-available, re-open possible [ 12751674 ] reopen-resolved, no closed status, patch-avail, testing [ 12757854 ]
        Gavin made changes -
        Workflow no-reopen-closed, patch-avail [ 12619544 ] patch-available, re-open possible [ 12751674 ]
        Hudson added a comment -

        Integrated in Cassandra-0.7 #534 (See https://builds.apache.org/job/Cassandra-0.7/534/)
        Use a UDF-specific context signature.
        Patch by Jeremy Hanna, reviewed by brandonwilliams for CASSANDRA-2869

        brandonwilliams : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1149341
        Files :

        • /cassandra/branches/cassandra-0.7/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
        Brandon Williams made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Fix Version/s 0.7.9 [ 12317245 ]
        Fix Version/s 0.8.2 [ 12316645 ]
        Resolution Fixed [ 1 ]
        Brandon Williams added a comment -

        Committed

        Jeremy Hanna made changes -
        Attachment 2869-2.txt [ 12486314 ]
        Jeremy Hanna added a comment -

        Removed that String. Also removed the code that added the mutation twice, and putNext now passes the nested exception into the IOException. We've been meaning to add those last two items to one of these tickets.

        Jeremy Hanna added a comment -

        Yes. I was about to post an updated patch last night but got sidetracked. Do you mind removing that if it's otherwise good to go? Otherwise I can do that later today.

        Brandon Williams made changes -
        Reviewer brandon.williams
        Brandon Williams added a comment -

        Looks like we can remove UDFCONTEXT_SCHEMA_KEY_PREFIX now too, no?

        Jeremy Hanna made changes -
        Attachment 2869.txt [ 12486144 ]
        Jeremy Hanna made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Jeremy Hanna added a comment -

        Simple patch to use the load and store signatures instead of the UDF context property keys we had been using. We're running this in our data pipeline and it appears to work correctly. However, I haven't found evidence that the old way wasn't working; the problem we saw seems to be more related to the read consistency level we were using. But this is probably the way we should be doing it, as it appears to be the Pig approach. Also, there could be some corner cases that might trip up the current approach.

        Jeremy Hanna made changes -
        Assignee Jeremy Hanna [ jeromatron ]
        Grant Ingersoll made changes -
        Field Original Value New Value
        Affects Version/s 0.7.2 [ 12316100 ]
        Grant Ingersoll created issue -

          People

          • Assignee:
            Jeremy Hanna
            Reporter:
            Grant Ingersoll
            Reviewer:
            Brandon Williams
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved:
