Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 0.7.4
    • Component/s: Examples, Hadoop
    • Labels:
      None

      Description

      Now that we have a ColumnFamilyOutputFormat, we can write data back to Cassandra from MapReduce jobs, but only in Java. It would be nice if Pig could also output to Cassandra.

      1. 0002-Fix-build-bin-script.txt
        2 kB
        Brandon Williams
      2. 0003-StoreFunc_with_deletion.txt
        12 kB
        Eldon Stegall
      3. 0001-add-storage-ability-to-pig-CassandraStorage.txt
        11 kB
        Brandon Williams

        Activity

        Brandon Williams added a comment - - edited

        Patch to allow storing output to Cassandra, so long as the output matches what you would get from loading either a CF or SCF. Note that I had to custom-build Pig with Jackson 1.4, since Pig bundles its own Jackson 1.0.1, which Avro does not seem to like.

        Brandon Williams added a comment -

        To rebuild Pig, edit ivy/libraries.properties, bump the Jackson version to 1.4.0, then run ant.

        Eldon Stegall added a comment -

        Should add deletion to the storage function. This patch applies cleanly to the 0.7.0 tag.

        Brandon Williams added a comment - - edited

        Couldn't get Eldon's patch to apply, but updated 0001 with his changes to add deletions and explicitly cast String, as well as other cleanups. Only 0001 and 0002 are part of the patchset; 0003 is now an outdated conglomeration of the two.

        Jeremy Hanna added a comment -

        +1

        2 comments:

        • I really like exceptions for invalid configuration - e.g. "PIG_INITIAL_ADDRESS environment variable not set"
        • Why not have getOutputFormat just return new ColumnFamilyOutputFormat()?
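
        The first point can be illustrated with a minimal, self-contained sketch. It is not the code from the patch: the class ConfigCheck, the helper requireEnv, and the variable name EXAMPLE_UNSET_VARIABLE are hypothetical, and the only detail taken from this thread is the error message format. The idea is to look up a required setting and fail immediately with an exception that names the missing variable, rather than letting a null propagate into the job.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the committed patch.
class ConfigCheck {

    /** Returns the value of a required setting, or throws an exception naming the missing variable. */
    static String requireEnv(Map<String, String> env, String name) throws IOException {
        String value = env.get(name);
        if (value == null || value.isEmpty()) {
            throw new IOException(name + " environment variable not set");
        }
        return value;
    }

    public static void main(String[] args) {
        // A plain map stands in for System.getenv() so the sketch runs anywhere.
        Map<String, String> env = new HashMap<>();
        env.put("PIG_INITIAL_ADDRESS", "localhost");
        try {
            System.out.println(requireEnv(env, "PIG_INITIAL_ADDRESS"));
            // EXAMPLE_UNSET_VARIABLE is a made-up name used to trigger the failure path.
            requireEnv(env, "EXAMPLE_UNSET_VARIABLE");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

        In a real storage function the map would presumably be System.getenv(), and the check would run once during setup so that a misconfigured job fails before any data is written.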
        Brandon Williams added a comment -

        Committed with the change from the second point (also applied in the input format) and updated the README.

        Hudson added a comment -

        Integrated in Cassandra-0.7 #345 (See https://hudson.apache.org/hudson/job/Cassandra-0.7/345/)
        Pig storefunc.
        Patch by brandonwilliams, reviewed by Jeremy Hanna for CASSANDRA-1828.


          People

          • Assignee:
            Brandon Williams
            Reporter:
            Brandon Williams
            Reviewer:
            Jeremy Hanna
          • Votes:
            2
            Watchers:
            4

            Dates

            • Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated:
              32h
              Remaining:
              32h
              Logged:
              Not Specified