Cassandra / CASSANDRA-3628

Make Pig/CassandraStorage delete functionality disabled by default and configurable

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 1.0.8
    • Component/s: None
    • Labels:

      Description

      Right now, there is a way to delete a column with the CassandraStorage loadstorefunc. In practice, it is a bad idea to have that enabled by default. A scenario: you do an outer join, some rows have no value for a field, and then you write all of the attributes of that relation out to Cassandra. You've just inadvertently deleted that column for every row that lacked the value as a result of the outer join. It can be argued that you should be careful with how you project after the join. However, I think disabling deletes by default, with a configurable property to enable them for the cases where you explicitly want them, is the right plan.

      Fwiw, we had a bug in one of our scripts that did exactly as described above. It's good to fix the bug. It's bad to implicitly delete data.
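
      To make the scenario above concrete, here is a minimal sketch of the proposed behavior. It is not the CassandraStorage code or the attached patch: `plan_mutation`, `plan_all`, and the `allow_deletes` flag are hypothetical names, and raising an error when a null arrives with deletes disabled is an assumption (silently skipping the column would be another option).

```python
from typing import List, Optional, Tuple


def plan_mutation(column: str, value: Optional[str],
                  allow_deletes: bool = False) -> Tuple:
    """Decide what to write for one column of an output tuple.

    Hypothetical helper: a null value is treated as a delete only when
    the caller has explicitly opted in via allow_deletes.
    """
    if value is not None:
        return ("insert", column, value)
    if allow_deletes:
        return ("delete", column)
    # Assumption: fail loudly rather than delete data implicitly.
    raise ValueError(f"null value for {column!r} but deletes are disabled")


def plan_all(rows: List[Tuple[str, Optional[str]]],
             allow_deletes: bool = False) -> List[Tuple]:
    """Plan the writes for (row_key, email) pairs, recording rejections."""
    plans = []
    for key, email in rows:
        try:
            plans.append((key,) + plan_mutation("email", email, allow_deletes))
        except ValueError:
            plans.append((key, "rejected", "email"))
    return plans


# Rows as they might come out of an outer join: "bob" had no match,
# so his email is null.
joined = [("alice", "a@example.com"), ("bob", None)]

print(plan_all(joined))                      # deletes disabled (default)
print(plan_all(joined, allow_deletes=True))  # deletes explicitly enabled
```

      With the default, the null for "bob" is rejected instead of becoming a column delete; only the explicit opt-in turns it into one.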

      1. 3628-v2.txt
        4 kB
        Brandon Williams
      2. 3628.txt
        4 kB
        Jeremy Hanna


          People

          • Assignee:
            Brandon Williams
            Reporter:
            Jeremy Hanna
            Reviewer:
            Pavel Yaskevich
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved: