Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-2658

Pig + CassandraStorage should work when trying to cast data after it's loaded

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Low
    • Resolution: Not A Problem
    • None
    • None
    • Low

    Description

      We currently do a lot with pig + cassandra, but one thing I've found is that currently it's very touchy with data that comes from Cassandra for some reason. For example, if I try to a SUM of data that has not been validated as an LongType in Cassandra, it borks. See this schema script for Cassandra - https://github.com/jeromatron/pygmalion/blob/master/cassandra/example_data.txt - and remove the validation on the num_heads data type and try to SUM that over the data and it gives data type errors. (It breaks with the num_heads validation removed and with or without the default_validation class being set.)

      We currently do analysis over data that is either just String (UTF8) data or that we have validated, so it works for us. However, I've seen a couple of people trying to use Cassandra with Pig that have had issues because of this. One of the tenets of pig is that it will eat anything and it kind of goes against this if the load/store somehow interferes with that. So in essence, I think this is a big deal for those wanting to use pig with cassandra in the ways that pig is normally used.

      Attachments

        Activity

          People

            brandon.williams Brandon Williams
            jeromatron Jeremy Hanna
            Brandon Williams
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: