Details
- Type: Improvement
- Status: Resolved
- Priority: Normal
- Resolution: Fixed
Description
The vector search quickstart at https://cassandra.apache.org/doc/latest/cassandra/getting-started/vector-search-quickstart.html does not work as written.
For example, creating the comments_vs table produces these errors:
instaclustr@cqlsh:cycling> CREATE TABLE IF NOT EXISTS cycling.comments_vs (
           ...   record_id timeuuid,
           ...   id uuid,
           ...   commenter text,
           ...   comment text,
           ...   comment_vector VECTOR <FLOAT, 5>;
SyntaxException: line 6:34 mismatched input ';' expecting ')' (...comment_vector VECTOR <FLOAT, 5>[;])
instaclustr@cqlsh:cycling>   created_at timestamp,
           ...   PRIMARY KEY (id, created_at)
           ... )
           ... WITH CLUSTERING ORDER BY (created_at DESC);
SyntaxException: line 1:0 no viable alternative at input 'created_at' ([created_at]...)
instaclustr@cqlsh:cycling>
The failed CREATE TABLE then breaks every subsequent command, and some of the later INSERT and SELECT statements need work even after the table definition is repaired.
There are a few errors in the CQL commands and table definitions. I got everything working with the CQL below.
CREATE KEYSPACE IF NOT EXISTS demo
  WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : '1' };

CREATE TABLE IF NOT EXISTS demo.comments_vs (
  record_id timeuuid,
  id uuid,
  commenter text,
  comment text,
  comment_vector VECTOR <FLOAT, 5>,
  created_at timestamp,
  PRIMARY KEY (id, created_at)
)
WITH CLUSTERING ORDER BY (created_at DESC);

CREATE INDEX IF NOT EXISTS ann_index
  ON demo.comments_vs(comment_vector) USING 'sai';

INSERT INTO demo.comments_vs (record_id, id, created_at, comment, commenter, comment_vector)
  VALUES (now(), e7ae5cf3-d358-4d99-b900-85902fda9bb0, '2017-03-21 13:11:09.999-0800', 'Second rest stop was out of water', 'Alex', [0.99, 0.5, 0.99, 0.1, 0.34]);
INSERT INTO demo.comments_vs (record_id, id, created_at, comment, commenter, comment_vector)
  VALUES (now(), e7ae5cf3-d358-4d99-b900-85902fda9bb0, '2017-04-01 06:33:02.16-0800', 'LATE RIDERS SHOULD NOT DELAY THE START', 'Alex', [0.9, 0.54, 0.12, 0.1, 0.95]);
INSERT INTO demo.comments_vs (record_id, id, created_at, comment, commenter, comment_vector)
  VALUES (now(), c7fceba0-c141-4207-9494-a29f9809de6f, totimestamp(now()), 'The gift certificate for winning was the best', 'Amy', [0.13, 0.8, 0.35, 0.17, 0.03]);
INSERT INTO demo.comments_vs (record_id, id, created_at, comment, commenter, comment_vector)
  VALUES (now(), c7fceba0-c141-4207-9494-a29f9809de6f, '2017-02-17 12:43:20.234+0400', 'Glad you ran the race in the rain', 'Amy', [0.3, 0.34, 0.2, 0.78, 0.25]);
INSERT INTO demo.comments_vs (record_id, id, created_at, comment, commenter, comment_vector)
  VALUES (now(), c7fceba0-c141-4207-9494-a29f9809de6f, '2017-03-22 5:16:59.001+0400', 'Great snacks at all reststops', 'Amy', [0.1, 0.4, 0.1, 0.52, 0.09]);
INSERT INTO demo.comments_vs (record_id, id, created_at, comment, commenter, comment_vector)
  VALUES (now(), c7fceba0-c141-4207-9494-a29f9809de6f, '2017-04-01 17:43:08.030+0400', 'Last climb was a killer', 'Amy', [0.3, 0.75, 0.2, 0.2, 0.5]);

SELECT * FROM demo.comments_vs
  ORDER BY comment_vector ANN OF [0.15, 0.1, 0.1, 0.35, 0.55]
  LIMIT 3;

SELECT comment, similarity_cosine(comment_vector, [0.2, 0.15, 0.3, 0.2, 0.05])
  FROM demo.comments_vs
  ORDER BY comment_vector ANN OF [0.1, 0.15, 0.3, 0.12, 0.05]
  LIMIT 1;
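To sanity-check what the first ANN query returns, the ranking can be recomputed by brute force outside Cassandra. The sketch below is only an illustration. It assumes the SAI index ranks by cosine similarity (which I believe is the default when no similarity_function option is given), and the names COMMENTS, cosine and nearest are made up for this example. It computes plain cosine similarity, so the scores may be scaled differently from what similarity_cosine reports; compare the ordering, not the values. ANN search is approximate, so the rows Cassandra returns could in principle differ.

```python
import math

# Vectors copied from the INSERT statements in this ticket, keyed by comment text.
COMMENTS = {
    "Second rest stop was out of water": [0.99, 0.5, 0.99, 0.1, 0.34],
    "LATE RIDERS SHOULD NOT DELAY THE START": [0.9, 0.54, 0.12, 0.1, 0.95],
    "The gift certificate for winning was the best": [0.13, 0.8, 0.35, 0.17, 0.03],
    "Glad you ran the race in the rain": [0.3, 0.34, 0.2, 0.78, 0.25],
    "Great snacks at all reststops": [0.1, 0.4, 0.1, 0.52, 0.09],
    "Last climb was a killer": [0.3, 0.75, 0.2, 0.2, 0.5],
}


def cosine(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, limit):
    """Exact (non-approximate) top-`limit` comments by cosine similarity."""
    ranked = sorted(
        COMMENTS,
        key=lambda comment: cosine(COMMENTS[comment], query),
        reverse=True,
    )
    return ranked[:limit]


if __name__ == "__main__":
    # Same query vector and LIMIT as the first SELECT.
    for comment in nearest([0.15, 0.1, 0.1, 0.35, 0.55], 3):
        print(comment)
```

The top two candidates for this query vector score very close to each other, so their relative order in Cassandra's output is the first thing to check if the results look different.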
I am raising this ticket so that it can be linked from a website PR.
Raised https://github.com/apache/cassandra/pull/2902/files with the corrected CQL. In the PR I also changed the keyspace name from demo back to cycling.