[CASSANDRA-5590] User defined types for CQL3 - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Normal
Resolution: Fixed
Fix Version/s: 2.1 beta1
Component/s: Legacy/CQL
Labels: None

Description

A typical use case for a collection could be to store a bunch of addresses in a user profile. An address could typically be composed of a few properties: say a street, a city, a postal code and maybe a few phone numbers associated with it.

To model that currently with collections, you might use a map<text, blob>, where the map key could be a string identifying the address, and the value would be all the address information serialized manually (you can use text instead of blob and shove everything in a string if you prefer, but the principle is the same).
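
As a rough illustration of that manual workaround, the sketch below packs one address into bytes on the client side, which is what an application would then bind to the blob value of the map. JSON is only an assumed encoding (the ticket does not prescribe one), and the helper names `pack_address` and `unpack_address` are hypothetical.

```python
import json


def pack_address(street, city, zip_code, phones):
    """Serialize one address to bytes by hand (JSON is an assumed encoding)."""
    record = {
        "street": street,
        "city": city,
        "zip_code": zip_code,
        "phones": sorted(phones),  # JSON has no set type, so store a sorted list
    }
    return json.dumps(record, sort_keys=True).encode("utf-8")


def unpack_address(blob):
    """Reverse of pack_address: decode the bytes and restore the phone set."""
    record = json.loads(blob.decode("utf-8"))
    record["phones"] = set(record["phones"])
    return record


# The value an application would store under the 'home' key of the map column.
addresses = {"home": pack_address("1 Main St", "SF", 94102, {"555-0100"})}
```

The database sees only opaque bytes here, so it can neither validate the fields nor read a single one of them, which is the inconvenience the proposal aims to remove.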

This ticket proposes making this more user-friendly by allowing:

CREATE TYPE address (
  street text,
  city text,
  zip_code int,
  phones set<text>
);

CREATE TABLE users (
  id uuid PRIMARY KEY,
  name text,
  addresses map<text, address>
);

Under the hood, that type declaration would just be metadata on top of CompositeType, which means one limitation is that we wouldn't allow re-ordering or removal of fields in a custom TYPE. Namely, the address type would in practice be a CompositeType(UTF8Type, UTF8Type, Int32Type, SetType(UTF8Type)) plus some metadata that records the name of each component. In other words, this would mostly be user-friendly syntactic sugar for creating composite blobs.
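
To make the "names on top of positional components" idea concrete, here is a minimal sketch. The `UserType` class and the 2-byte length prefix per component are assumptions made for illustration, not a description of CompositeType's actual layout. Component values are taken as already-serialized bytes.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class UserType:
    """Hypothetical model: a type name plus one field name per component."""
    name: str
    field_names: tuple  # metadata only; the blob itself stores no names

    def pack(self, values):
        """Write the components in declared order, each length-prefixed."""
        out = bytearray()
        for field in self.field_names:
            raw = values[field]
            out += struct.pack(">H", len(raw)) + raw
        return bytes(out)

    def unpack(self, blob):
        """Read the components back by position and label them by name."""
        result, offset = {}, 0
        for field in self.field_names:
            (length,) = struct.unpack_from(">H", blob, offset)
            offset += 2
            result[field] = blob[offset:offset + length]
            offset += length
        return result


fullname = UserType("fullname", ("firstname", "lastname"))
blob = fullname.pack({"firstname": b"Paul", "lastname": b"Smith"})

# Decoding the same bytes with re-ordered metadata mislabels the values.
reordered = UserType("fullname", ("lastname", "firstname"))
mislabeled = reordered.unpack(blob)
```

Because the bytes carry positions rather than names, decoding with re-ordered metadata silently swaps the fields, which is why re-ordering or removing fields would not be allowed.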

I'll note that this would also be useful outside collections, as it might sometimes be more efficient/useful to have such a simple composite blob. For instance, you could imagine having:

CREATE TYPE fullname (
  firstname text,
  lastname text
);

and to rewrite the users table above as

CREATE TABLE users (
  id uuid PRIMARY KEY,
  name fullname,
  addresses map<text, address>
);

In terms of inserts, we'd need a syntax for these new "structs". It could be:

INSERT INTO users (id, name)
           VALUES (2ad..., { firstname: 'Paul', lastname: 'smith'});
UPDATE users
   SET addresses = addresses + { 'home': { street: '...', city: 'SF', zip_code: 94102, phones: {} } }
   WHERE id=2ad...;

where the difference from a map is that the "key" would be a column name (in the CQL3 sense), not a value/literal. We might find that a bit confusing, though, and settle on some other syntax.
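
One way to read that distinction is that the keys of a struct literal must match the field names declared for the type, whereas map keys are arbitrary values. The sketch below shows such a check. The function name `check_struct_literal` is hypothetical, and filling omitted fields with `None` is an assumption, since the ticket does not say how missing fields would be handled.

```python
ADDRESS_FIELDS = ("street", "city", "zip_code", "phones")


def check_struct_literal(declared_fields, literal):
    """Reject keys that are not declared fields, then order values by declaration."""
    unknown = [key for key in literal if key not in declared_fields]
    if unknown:
        raise ValueError(f"unknown field(s) for this type: {unknown}")
    # Assumption: fields omitted from the literal are treated as null (None).
    return {field: literal.get(field) for field in declared_fields}


home = check_struct_literal(
    ADDRESS_FIELDS,
    {"street": "...", "city": "SF", "zip_code": 94102, "phones": set()},
)
```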

On the query side, we could optionally allow things like:

SELECT name.firstname, name.lastname FROM users WHERE id=2ad...;

One open question, however, is what type we send back in the result set for a query like:

SELECT name FROM users WHERE id=2ad...;

We could:

- return just that it's the user defined type named fullname, but that implies the client has to query the cluster metadata to find out the definition of the type.
- return the full definition of the type every time.
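
The trade-off between the two options can be sketched as follows. The dictionary shapes, the `CLUSTER_TYPES` lookup table, and all function names are hypothetical and do not describe any actual protocol format.

```python
# Hypothetical cluster metadata a client would need to fetch under option 1.
CLUSTER_TYPES = {"fullname": [("firstname", "text"), ("lastname", "text")]}


def column_spec_by_name(type_name):
    """Option 1: the result set only names the user defined type."""
    return {"kind": "udt", "name": type_name}


def column_spec_inline(type_name, fields):
    """Option 2: the result set carries the full definition every time."""
    return {"kind": "udt", "name": type_name, "fields": list(fields)}


def resolve_fields(spec, cluster_types):
    """Return the field list, falling back to a metadata lookup if it is absent."""
    if "fields" in spec:
        return spec["fields"]
    return cluster_types[spec["name"]]  # the extra lookup that option 1 requires


by_name = column_spec_by_name("fullname")
inline = column_spec_inline("fullname", CLUSTER_TYPES["fullname"])
```

Both specs resolve to the same fields. Option 1 keeps each response small at the cost of a metadata lookup, while option 2 is self-describing but repeats the definition in every result set.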

I also note that, client side, it might be a tad harder to support such types cleanly in statically typed languages than in dynamically typed ones, but that's not the end of the world either.

Attachments

hadoop-pr000007-dumps-updated-2019.pdf, 31/Jan/19 08:00, 167 kB, carolyncoleman
ocd-and-corrections-patch.txt, 31/Oct/13 12:20, 14 kB, Aleksey Yeschenko

Issue Links

relates to

CASSANDRA-6705 ALTER TYPE <type> RENAME <field> fails sometime with java.lang.AssertionError: null

Resolved

CASSANDRA-6304 Better handling of authorization for User Types

Resolved

CASSANDRA-6305 cqlsh support for User Types

Resolved

CASSANDRA-6312 Create dtest suite for user types

Resolved

CASSANDRA-6438 Make user types keyspace scoped

Resolved

People

Assignee: Sylvain Lebresne

Reporter: Sylvain Lebresne

Authors: Sylvain Lebresne

Reviewers: Aleksey Yeschenko

Votes: 1

Watchers: 19

Dates

Created: 23/May/13 17:40

Updated: 16/Apr/19 09:32

Resolved: 04/Nov/13 15:09