  1. Apache Cassandra
  2. CASSANDRA-9406

Add Option to Not Validate Atoms During Scrub

Details

    • Low

    Description

      In Scrubber, the instantiation of SSTableIdentityIterator hardcodes checkData to true. This should be made configurable when running scrub via JMX or StandaloneScrubber.

      Since inbound data is not validated, scrub without this option throws away data that is not corrupt but merely "misrepresented" (e.g. an int is stored but the validator is LongType). Cassandra and application clients will happily continue to read and write data with this misrepresentation (although some care may be needed on the application side), yet scrub throws these rows out, leading to a large amount of data loss.

      In these applications it is desirable for scrub to check for row/file corruption without validating the column values, since validation can result in a large percentage of the data being thrown away. This would be made possible by adding a flag that disables validation in the SSTableIdentityIterator.
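
      To illustrate the behaviour described above, here is a minimal, self-contained sketch. It is not the Cassandra code: ScrubSketch, isValidLong, sampleRows and scrub are hypothetical names, and the length check only stands in for a LongType-style validator. It shows how a checkData flag decides whether a readable but "misrepresented" value (a 4-byte int under an 8-byte long validator) is kept or dropped.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Cassandra.
class ScrubSketch {
    // Stand-in for a LongType-style validator (assumption): a value is
    // accepted only if it is empty or exactly 8 bytes long.
    static boolean isValidLong(ByteBuffer value) {
        int n = value.remaining();
        return n == 0 || n == 8;
    }

    // One well-formed 8-byte long and one 4-byte int stored in the same column.
    static List<ByteBuffer> sampleRows() {
        List<ByteBuffer> rows = new ArrayList<>();
        rows.add(ByteBuffer.allocate(8).putLong(0, 42L)); // matches the validator
        rows.add(ByteBuffer.allocate(4).putInt(0, 42));   // "misrepresented" value
        return rows;
    }

    // With checkData == true, values that fail validation are dropped.
    // With checkData == false, every value that could be read is kept.
    static List<ByteBuffer> scrub(List<ByteBuffer> rows, boolean checkData) {
        List<ByteBuffer> kept = new ArrayList<>();
        for (ByteBuffer row : rows) {
            if (!checkData || isValidLong(row)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println("checkData=true keeps  " + scrub(sampleRows(), true).size() + " of 2");
        System.out.println("checkData=false keeps " + scrub(sampleRows(), false).size() + " of 2");
    }
}
```

      In this sketch the validating pass keeps one of the two values, while the non-validating pass keeps both. The 4-byte value is lost only because of the type check, not because it is unreadable.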

          People

            jwest Jordan West
            Yuki Morishita