  1. Cassandra
  2. CASSANDRA-9406

Add Option to Not Validate Atoms During Scrub

    Details

    • Severity:
      Low

      Description

      In Scrubber, the instantiation of SSTableIdentityIterator hardcodes checkData to true. This should be made configurable when running scrub via JMX or StandaloneScrubber.

      Because inbound data is not validated, scrub without this option throws away data that is not corrupt but merely "misrepresented" (e.g. an int is stored but the validator is LongType). Cassandra and application clients will happily continue to read and write data with this misrepresentation (although some care may be needed on the application side), yet scrub discards these rows, which can lead to a large amount of data loss.

      In these applications it is desirable for scrub to check for row/file corruption but not to validate the column values, since validation can result in a large percentage of data being thrown away. A flag that disables validation in the SSTableIdentityIterator would make this possible.
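
      To illustrate the request, here is a minimal, self-contained sketch in Java. It is not the actual Scrubber or SSTableIdentityIterator code: the Cell class, the readable field, and the isValidLong and scrub methods are hypothetical stand-ins, and the only name taken from this ticket is checkData. The sketch assumes a validator for a long column that accepts only empty or 8-byte values, so a 4-byte int written into that column is readable but fails validation.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ScrubSketch {
    /** Hypothetical cell: a raw value plus a marker for on-disk corruption. */
    static final class Cell {
        final ByteBuffer value;
        final boolean readable; // false models row/file corruption

        Cell(ByteBuffer value, boolean readable) {
            this.value = value;
            this.readable = readable;
        }
    }

    /** Assumed validator for a long column: accepts empty or exactly 8 bytes. */
    public static boolean isValidLong(ByteBuffer value) {
        int n = value.remaining();
        return n == 0 || n == 8;
    }

    /**
     * Keeps readable cells. Unreadable (corrupt) cells are always dropped;
     * value validation only runs when checkData is true.
     */
    public static List<Cell> scrub(List<Cell> cells, boolean checkData) {
        List<Cell> kept = new ArrayList<>();
        for (Cell cell : cells) {
            if (!cell.readable) {
                continue; // corruption check is not optional
            }
            if (checkData && !isValidLong(cell.value)) {
                continue; // "misrepresented" value dropped only when validating
            }
            kept.add(cell);
        }
        return kept;
    }

    /** Builds three cells: a 4-byte int, an 8-byte long, and a corrupt cell. */
    public static List<Cell> sampleCells() {
        Cell asInt = new Cell(ByteBuffer.allocate(4).putInt(0, 42), true);
        Cell asLong = new Cell(ByteBuffer.allocate(8).putLong(0, 42L), true);
        Cell corrupt = new Cell(ByteBuffer.allocate(8), false);
        return Arrays.asList(asInt, asLong, corrupt);
    }

    public static void main(String[] args) {
        List<Cell> cells = sampleCells();
        System.out.println(scrub(cells, true).size());  // prints 1
        System.out.println(scrub(cells, false).size()); // prints 2
    }
}
```

      With checkData set to true, the int-sized value is discarded along with the corrupt cell. With it set to false, only the corrupt cell is dropped, which is the behavior requested here.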

            People

            • Assignee:
              jrwest Jordan West
              Reporter:
              jrwest Jordan West
              Authors:
              Jordan West
              Reviewers:
              Yuki Morishita
            • Votes:
              0
              Watchers:
              2
