Details
- Type: New Feature
- Status: Resolved
- Priority: Normal
- Resolution: Fixed
Description
A typical use case for a collection could be to store a bunch of addresses in a user profile. An address would typically be composed of a few properties: say a street, a city, a postal code and maybe a few phone numbers associated with it.
To model that currently with collections, you might use a map<string, blob>, where the map key could be a string identifying the address, and the value would be all the information for an address, serialized manually (you can use text instead of blob and shove everything in a string if you prefer, but the principle is the same).
This ticket suggests making this more user-friendly by allowing:
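A minimal client-side sketch of that manual approach, assuming JSON as the serialization format (the ticket does not prescribe one). The helper names `serialize_address` and `deserialize_address` are hypothetical, and the dictionary stands in for the map<string, blob> column.

```python
import json

def serialize_address(street, city, zip_code, phones):
    """Pack one address into bytes suitable for a blob map value.

    JSON is an assumption made for this sketch; any format would do.
    """
    record = {
        "street": street,
        "city": city,
        "zip_code": zip_code,
        "phones": sorted(phones),  # JSON has no set type, so store a sorted list
    }
    return json.dumps(record, sort_keys=True).encode("utf-8")

def deserialize_address(blob):
    """Reverse of serialize_address: bytes back to a dict with a set of phones."""
    record = json.loads(blob.decode("utf-8"))
    record["phones"] = set(record["phones"])
    return record

# The map column as the client sees it: address label -> opaque bytes.
addresses = {"home": serialize_address("1 Main St", "SF", 94102, {"555-0100"})}
home = deserialize_address(addresses["home"])
```

The database only ever sees opaque bytes here, so it cannot validate the fields or select one of them, which is the gap the proposal aims to close.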
CREATE TYPE address (
    street text,
    city text,
    zip_code int,
    phones set<text>
);

CREATE TABLE users (
    id uuid PRIMARY KEY,
    name text,
    addresses map<string, address>
);
Under the hood, that type declaration would just be metadata on top of CompositeType (which does mean a limitation would be that we wouldn't allow re-ordering or removal of fields in a custom TYPE). Namely, the address type would be in practice a CompositeType(UTF8Type, UTF8Type, Int32Type, SetType(UTF8Type)) + some metadata that records the name of each component. In other words, this would mostly be user-friendly syntactic sugar to create composite blobs.
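To illustrate why field order would matter, here is a rough sketch of a positional encoding with field names kept only in separate metadata. The length-prefixed layout below is a simplification chosen for this example, not Cassandra's actual CompositeType byte format. The `phones` set is left out for brevity, and all function names are hypothetical.

```python
import struct

# Hypothetical metadata for the `address` type: ordered (field name, kind) pairs.
# Only "text" and "int" are modelled to keep the sketch short.
ADDRESS_FIELDS = [("street", "text"), ("city", "text"), ("zip_code", "int")]

def encode_component(kind, value):
    if kind == "text":
        return value.encode("utf-8")
    if kind == "int":
        return struct.pack(">i", value)  # 4-byte big-endian signed int
    raise ValueError(f"unsupported kind: {kind}")

def decode_component(kind, raw):
    if kind == "text":
        return raw.decode("utf-8")
    if kind == "int":
        return struct.unpack(">i", raw)[0]
    raise ValueError(f"unsupported kind: {kind}")

def encode_struct(fields, values):
    """Write each component as a 2-byte length followed by its bytes, in field order."""
    out = bytearray()
    for name, kind in fields:
        raw = encode_component(kind, values[name])
        out += struct.pack(">H", len(raw)) + raw
    return bytes(out)

def decode_struct(fields, blob):
    """Read components back positionally; names come only from the metadata."""
    values, offset = {}, 0
    for name, kind in fields:
        (length,) = struct.unpack_from(">H", blob, offset)
        offset += 2
        values[name] = decode_component(kind, blob[offset:offset + length])
        offset += length
    return values

blob = encode_struct(ADDRESS_FIELDS, {"street": "1 Main St", "city": "SF", "zip_code": 94102})
decoded = decode_struct(ADDRESS_FIELDS, blob)

# Reordering the metadata after data was written mislabels the stored components.
reordered = [("city", "text"), ("street", "text"), ("zip_code", "int")]
mislabelled = decode_struct(reordered, blob)
```

Because the stored bytes carry no field names, reordering or removing a field in the metadata would silently change the meaning of existing values, which is consistent with the limitation noted above.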
I'll note that this would also be useful outside collections, as it might sometimes be more efficient/useful to have such a simple composite blob. For instance, you could imagine having:
CREATE TYPE fullname ( firstname text, lastname text )
and to rewrite the users table above as
CREATE TABLE users ( id uuid PRIMARY KEY, name fullname, addresses map<string, address> )
In terms of inserts, we'd need a syntax for these new "structs". It could be:
INSERT INTO users (id, name) VALUES (2ad..., { firstname: 'Paul', lastname: 'smith' });
UPDATE users SET addresses = addresses + { 'home': { street: '...', city: 'SF', zip_code: 94102, phones: {} } } WHERE id=2ad...;
where the difference from a map is that the "key" would be a column name (in the CQL3 sense), not a value/literal. We might find that a bit confusing, though, and settle on some other syntax.
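The distinction can be sketched as a validation step: a struct literal's keys must come from the fixed set of declared field names, whereas a map accepts arbitrary key values. The function `check_struct_literal` is hypothetical, and filling omitted fields with `None` is an assumption of this sketch rather than something the ticket specifies.

```python
# Declared field names of the hypothetical `fullname` type, in declaration order.
FULLNAME_FIELDS = ("firstname", "lastname")

def check_struct_literal(type_fields, literal):
    """Accept a literal only if every key is a declared field name.

    Returns the values in declaration order; omitted fields become None
    (an assumption of this sketch).
    """
    unknown = [key for key in literal if key not in type_fields]
    if unknown:
        raise ValueError("unknown field(s): " + ", ".join(unknown))
    return {name: literal.get(name) for name in type_fields}

ok = check_struct_literal(FULLNAME_FIELDS, {"firstname": "Paul", "lastname": "smith"})

try:
    check_struct_literal(FULLNAME_FIELDS, {"firstname": "Paul", "nickname": "P"})
    rejected = False
except ValueError:
    rejected = True  # "nickname" is not a field of fullname

# A map, by contrast, takes any key value without reference to a declaration.
as_map = {"firstname": "Paul", "nickname": "P"}
```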
On the query side, we could optionally allow things like:
SELECT name.firstname, name.lastname FROM users WHERE id=2ad...;
One open question, however, is what type we send back in the result set for a query like:
SELECT name FROM users WHERE id=2ad...;
We could:
- return just that it's the user-defined type named fullname, but that implies the client has to query the cluster metadata to find out the definition of the type.
- return the full definition of the type every time.
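The trade-off between the two options can be sketched from the client's point of view. Everything here is hypothetical: the metadata dictionaries, the `NameOnlyClient` class and the `fetch_from_schema` function merely stand in for whatever a real driver and protocol would use.

```python
# Stand-in for the cluster's schema metadata (hypothetical).
TYPE_DEFINITIONS = {"fullname": [("firstname", "text"), ("lastname", "text")]}

def fetch_from_schema(type_name):
    """Simulates the extra metadata query needed under the first option."""
    return TYPE_DEFINITIONS[type_name]

class NameOnlyClient:
    """Option 1: the result set carries only the type name."""

    def __init__(self, fetch_definition):
        self._fetch = fetch_definition
        self._cache = {}
        self.lookups = 0  # counts metadata queries actually issued

    def resolve(self, column_metadata):
        name = column_metadata["type_name"]
        if name not in self._cache:
            self.lookups += 1
            self._cache[name] = self._fetch(name)
        return self._cache[name]

def resolve_inline(column_metadata):
    """Option 2: the result set carries the full definition every time."""
    return column_metadata["fields"]

client = NameOnlyClient(fetch_from_schema)
by_name_first = client.resolve({"type_name": "fullname"})
by_name_second = client.resolve({"type_name": "fullname"})  # served from the cache

inline = resolve_inline({
    "type_name": "fullname",
    "fields": [("firstname", "text"), ("lastname", "text")],
})
```

With the first option the client pays a lookup once per type and then has to keep its cache in step with schema changes. With the second, every response is larger but self-describing.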
I also note that on the client side, it might be a tad harder to support such types cleanly in statically typed languages than in dynamically typed ones, but that's not the end of the world either.
Issue Links
- relates to
  - CASSANDRA-6705 ALTER TYPE <type> RENAME <field> fails sometime with java.lang.AssertionError: null (Resolved)
  - CASSANDRA-6304 Better handling of authorization for User Types (Resolved)
  - CASSANDRA-6305 cqlsh support for User Types (Resolved)
  - CASSANDRA-6312 Create dtest suite for user types (Resolved)
  - CASSANDRA-6438 Make user types keyspace scoped (Resolved)