Details
-
Bug
-
Status: In Progress
-
Normal
-
Resolution: Unresolved
-
None
-
Normal
Description
We currently disallow creation of a view that has the exact same primary key as the base where no clustering keys are present, however a potential use case would be a view where part of the PK is filtered so as to have a subset of data in the view which is faster for range queries. We actually currently allow this, but only if you have a clustering key defined. If you only have a partitioning key it's not possible.
From the mailing list, the below example works:
CREATE TABLE users ( site_id int, user_id text, n int, data set<frozen<text>>, PRIMARY KEY ((site_id, user_id), n)); user data is updated and read by PK and sometimes I have to fetch all user for some specific site_id. It appeared that full scan by token(site_id,user_id) filtered by WHERE site_id = <some id> works much slower than unfiltered full scan on CREATE MATERIALIZED VIEW users_1 AS SELECT site_id, user_id, n, data FROM users WHERE site_id = 1 AND user_id IS NOT NULL AND n IS NOT NULL PRIMARY KEY ((site_id, user_id), n);
However the following does not:
CREATE TABLE users ( site_id int, user_id text, data set<text>, PRIMARY KEY ((site_id, user_id))); CREATE MATERIALIZED VIEW users_1 AS SELECT site_id, user_id, data FROM users WHERE site_id = 1 AND user_id IS NOT NULL PRIMARY KEY ((site_id, user_id)); InvalidRequest: Error from server: code=2200 [Invalid query] message="No columns are defined for Materialized View other than primary key"
This is because if the clustering key is empty we assume they've only defined the primary key in the partition key and we haven't accounted for this use case.
On that note, we also don't allow the following narrowing of the partition key:
CREATE TABLE kurt.base ( id int, uid text, data text, PRIMARY KEY (id, uid) ) CREATE MATERIALIZED VIEW kurt.mv2 AS SELECT * from kurt.base where id IS NOT NULL and uid='1' PRIMARY KEY ((id, uid)); InvalidRequest: Error from server: code=2200 [Invalid query] message="No columns are defined for Materialized View other than primary key"
But we do allow the following, which works because there is still a clustering key, despite not changing the PK.
CREATE MATERIALIZED VIEW kurt.mv2 AS SELECT * from kurt.base where id IS NOT NULL and uid='1' PRIMARY KEY (id, uid);
And we also allow the following, which is a narrowing of the partition key as above, but with an extra clustering key.
create table kurt.base3 (id int, uid int, clus1 int, clus2 int, data text, PRIMARY KEY ((id, uid), clus1, clus2)); CREATE MATERIALIZED VIEW kurt.mv4 AS SELECT * from kurt.base3 where id IS NOT NULL and uid IS NOT NULL and clus1 IS NOT NULL AND clus2 IS NOT NULL PRIMARY KEY ((id, uid, clus1), clus2);
I think supporting these cases is trivial and mostly already handled in the underlying MV write path, so we might be able to get away with just a simple change of this condition.