Cassandra / CASSANDRA-15803

Separate out allow filtering scanning through a partition versus scanning over the table

Details

    • Type: Improvement
    • Status: Triage Needed
    • Priority: Normal
    • Resolution: Unresolved
    • None
    • Component/s: CQL/Syntax
    • None
    • All
    • None

    Description

      Currently, allow filtering can mean two things, both in the spirit of "avoid operations that don't seek to a specific row or to sequential rows of data."  First, it can mean scanning across the entire table to find rows that meet the criteria of the query.  That's almost always a bad thing and should be discouraged or disabled (see CASSANDRA-8303).  Second, it can mean filtering within a specific partition.  For example, a query that specifies the full partition key and also a criterion on a non-key field requires allow filtering.

      The second case is significantly less work, because it only scans through a single partition.  It is still extra work compared with seeking to a specific row and reading N sequential rows, though.  So while an application developer and/or operator needs to be cautious about this second type, it's not necessarily a bad thing, depending on the table and the use case.

      I propose that we separate, in a backwards-compatible way, how to specify allow filtering across an entire table from how to specify allow filtering within a partition.  One idea that was brought up in Slack in the cassandra-dev room was to have allow filtering keep meaning the superset: scanning across the table.  Then, if you want to specify that you only want to scan within a partition, you would use something like

      ALLOW FILTERING [WITHIN PARTITION]

      With WITHIN PARTITION, the query will succeed if it specifies non-key criteria within a single partition, but fail if it would scan across the table, with a message saying that it requires the full allow filtering.  This keeps the full allow filtering backwards compatible while letting a user state that they only want to scan within a partition and get an error if the query would scan the full table.
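
      The intended behaviour can be sketched as a small decision function.  This is only an illustration of the proposal, not Cassandra code: the names Filtering, InvalidRequest and check_filtering are hypothetical, the error messages are made up, and it assumes a query stays within one partition only when every partition key column is restricted by equality (IN restrictions and token ranges are ignored).

```python
from enum import Enum


class Filtering(Enum):
    """Filtering clause carried by a query (hypothetical names)."""
    NONE = "none"                          # no ALLOW FILTERING clause
    WITHIN_PARTITION = "within partition"  # proposed ALLOW FILTERING WITHIN PARTITION
    FULL = "full"                          # existing ALLOW FILTERING


class InvalidRequest(Exception):
    """Stand-in for the error returned to the client."""


def check_filtering(partition_key, eq_restricted, needs_filtering, mode):
    """Return the kind of read the query may perform, or raise InvalidRequest.

    partition_key   -- names of the partition key columns
    eq_restricted   -- columns restricted by equality in the WHERE clause
    needs_filtering -- True if some restriction cannot be served by a seek
    mode            -- the Filtering clause the query carries
    """
    if not needs_filtering:
        return "seek"
    # Assumption: the scan stays inside one partition only if every
    # partition key column is pinned to a single value.
    single_partition = set(partition_key) <= set(eq_restricted)
    if mode is Filtering.FULL:
        # Backwards compatible: the full clause permits both kinds of scan.
        return "partition scan" if single_partition else "table scan"
    if mode is Filtering.WITHIN_PARTITION:
        if single_partition:
            return "partition scan"
        raise InvalidRequest("query scans the table; it requires the full ALLOW FILTERING")
    raise InvalidRequest("query requires ALLOW FILTERING")
```

      For a hypothetical table with partition key sensor_id and a non-key column status, restricting both columns passes with WITHIN PARTITION, while restricting only status raises the error and would need the full clause.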

      This is potentially also related to the capability limitation framework by which operators could more granularly specify what features are allowed or disallowed per user, discussed in CASSANDRA-8303.  This way an operator could disallow the more general allow filtering while allowing the partition scan (or disallow them both at their discretion).

            People

              smiklosovic Stefan Miklosovic
              jeromatron Jeremy Hanna
              Votes: 0
              Watchers: 5