CASSANDRA-3628

Make Pig/CassandraStorage delete functionality disabled by default and configurable


Details

    • Type: Task
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • 1.0.8
    • None

    Description

      Right now, there is a way to delete a column with the CassandraStorage loadstorefunc. In practice it is a bad idea to have that enabled by default. A scenario: you do an outer join, some rows have no value for an attribute, and you then write all of the relation's attributes out to Cassandra. You've just inadvertently deleted that column for every row that lacked the value as a result of the outer join. It can be argued that you should be careful with how you project after the join. However, I think disabling deletes by default, with a configurable property to enable them for the cases where you explicitly want them, is the right plan.

      Fwiw, we had a bug in one of our scripts that did exactly as described above. It's good to fix the bug. It's bad to implicitly delete data.
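
      To make the proposed behavior concrete, here is a minimal Python sketch of the idea, not the CassandraStorage code itself. The function name `build_mutations`, the flag name `allow_deletes`, and the choice to raise an error when a null shows up while deletes are disabled are all assumptions made for illustration; the actual property name and failure mode are whatever the attached patch defines.

```python
def build_mutations(row_key, columns, allow_deletes=False):
    """Turn (name, value) pairs for one row into a list of mutations.

    Hypothetical helper: a None value stands in for the null that an
    outer join produces when one side has no match.
    """
    mutations = []
    for name, value in columns:
        if value is None:
            if not allow_deletes:
                # Assumed policy: fail loudly instead of silently deleting.
                raise ValueError(
                    "null value for column %r but deletes are disabled" % name
                )
            mutations.append(("delete", row_key, name))
        else:
            mutations.append(("write", row_key, name, value))
    return mutations


# A row from an outer join where 'email' had no match on the other side.
joined_row = [("name", "alice"), ("email", None)]

# Opting in explicitly turns the null into a column delete.
opted_in = build_mutations("user1", joined_row, allow_deletes=True)
```

      With the default `allow_deletes=False`, the same row raises an error rather than removing the `email` column, so a buggy projection after the join surfaces as a failed job instead of lost data. Skipping the null column silently would be another possible policy, but it hides the bug.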

      Attachments

        1. 3628-v2.txt
          4 kB
          Brandon Williams
        2. 3628.txt
          4 kB
          Jeremy Hanna

          People

            brandon.williams Brandon Williams
            jeromatron Jeremy Hanna
            Brandon Williams
            Pavel Yaskevich
