Cassandra / CASSANDRA-19030

Vector Quickstart Documentation does not work

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: 5.0-beta1, 5.0, 5.1
    • Component/s: Documentation
    • Labels: None

    Description

      The vector search quickstart at https://cassandra.apache.org/doc/latest/cassandra/getting-started/vector-search-quickstart.html does not work as written.

      For example, creating the comments_vs table produces these errors:

      instaclustr@cqlsh:cycling> CREATE TABLE IF NOT EXISTS cycling.comments_vs (
                 ...   record_id timeuuid,
                 ...   id uuid,
                 ...   commenter text,
                 ...   comment text,
                 ...   comment_vector VECTOR <FLOAT, 5>;
      SyntaxException: line 6:34 mismatched input ';' expecting ')' (...comment_vector VECTOR <FLOAT, 5>[;])
      instaclustr@cqlsh:cycling>   created_at timestamp,
                 ...   PRIMARY KEY (id, created_at)
                 ... )
                 ... WITH CLUSTERING ORDER BY (created_at DESC);
      SyntaxException: line 1:0 no viable alternative at input 'created_at' ([created_at]...)
      instaclustr@cqlsh:cycling> 

      This breaks all the subsequent commands. Some of the later INSERT and SELECT statements need work even after the table definition is repaired.

      There are a few errors in the CQL commands and table definitions. I managed to get it working with the CQL below.

      CREATE KEYSPACE IF NOT EXISTS demo
         WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : '1' };
         
      CREATE TABLE IF NOT EXISTS demo.comments_vs (
        record_id timeuuid,
        id uuid,
        commenter text,
        comment text,
        comment_vector VECTOR <FLOAT, 5>,
        created_at timestamp,
        PRIMARY KEY (id, created_at)
      ) WITH CLUSTERING ORDER BY (created_at DESC);

      CREATE INDEX IF NOT EXISTS ann_index
        ON demo.comments_vs(comment_vector) USING 'sai';
        
        
      INSERT INTO demo.comments_vs (record_id, id, created_at, comment, commenter, comment_vector)
         VALUES (
            now(),
            e7ae5cf3-d358-4d99-b900-85902fda9bb0,
            '2017-03-21 13:11:09.999-0800',
            'Second rest stop was out of water',
            'Alex',
            [0.99, 0.5, 0.99, 0.1, 0.34]
      );
      
      INSERT INTO demo.comments_vs (record_id, id, created_at, comment, commenter, comment_vector)
         VALUES (
            now(),
            e7ae5cf3-d358-4d99-b900-85902fda9bb0,
            '2017-04-01 06:33:02.16-0800',
            'LATE RIDERS SHOULD NOT DELAY THE START',
            'Alex',
            [0.9, 0.54, 0.12, 0.1, 0.95]
      );
      
      INSERT INTO demo.comments_vs (record_id, id, created_at, comment, commenter, comment_vector)
         VALUES (
            now(),
            c7fceba0-c141-4207-9494-a29f9809de6f,
            totimestamp(now()),
            'The gift certificate for winning was the best',
            'Amy',
            [0.13, 0.8, 0.35, 0.17, 0.03]
      );
      
      INSERT INTO demo.comments_vs (record_id, id, created_at, comment, commenter, comment_vector)
         VALUES (
            now(),
            c7fceba0-c141-4207-9494-a29f9809de6f,
            '2017-02-17 12:43:20.234+0400',
            'Glad you ran the race in the rain',
            'Amy',
            [0.3, 0.34, 0.2, 0.78, 0.25]
      );
      
      INSERT INTO demo.comments_vs (record_id, id, created_at, comment, commenter, comment_vector)
         VALUES (
            now(),
            c7fceba0-c141-4207-9494-a29f9809de6f,
            '2017-03-22 5:16:59.001+0400',
            'Great snacks at all reststops',
            'Amy',
            [0.1, 0.4, 0.1, 0.52, 0.09]
      );
      
      INSERT INTO demo.comments_vs (record_id, id, created_at, comment, commenter, comment_vector)
         VALUES (
            now(),
            c7fceba0-c141-4207-9494-a29f9809de6f,
            '2017-04-01 17:43:08.030+0400',
            'Last climb was a killer',
            'Amy',
            [0.3, 0.75, 0.2, 0.2, 0.5]
      );
      
      SELECT * FROM demo.comments_vs
          ORDER BY comment_vector ANN OF [0.15, 0.1, 0.1, 0.35, 0.55]
          LIMIT 3;
          
      SELECT comment, similarity_cosine(comment_vector, [0.2, 0.15, 0.3, 0.2, 0.05])
          FROM demo.comments_vs
          ORDER BY comment_vector ANN OF [0.1, 0.15, 0.3, 0.12, 0.05]
          LIMIT 1; 
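
As a rough sanity check on the first SELECT above, the six inserted vectors can be ranked by brute force outside Cassandra. This is only a sketch: it assumes the SAI index compares vectors by cosine similarity (which I believe is the default when no similarity function is specified), and the helper names `cosine` and `nearest` are made up for illustration. ANN search is approximate, so cqlsh is not guaranteed to return rows in exactly this order.

```python
import math

# Vectors copied from the INSERT statements above, keyed by comment text.
ROWS = {
    "Second rest stop was out of water": [0.99, 0.5, 0.99, 0.1, 0.34],
    "LATE RIDERS SHOULD NOT DELAY THE START": [0.9, 0.54, 0.12, 0.1, 0.95],
    "The gift certificate for winning was the best": [0.13, 0.8, 0.35, 0.17, 0.03],
    "Glad you ran the race in the rain": [0.3, 0.34, 0.2, 0.78, 0.25],
    "Great snacks at all reststops": [0.1, 0.4, 0.1, 0.52, 0.09],
    "Last climb was a killer": [0.3, 0.75, 0.2, 0.2, 0.5],
}


def cosine(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, rows, limit):
    """Exact (non-approximate) top-`limit` rows by cosine similarity."""
    scored = [(cosine(vec, query), comment) for comment, vec in rows.items()]
    scored.sort(reverse=True)  # highest similarity first
    return scored[:limit]


if __name__ == "__main__":
    # Same query vector and LIMIT as the first SELECT.
    for score, comment in nearest([0.15, 0.1, 0.1, 0.35, 0.55], ROWS, 3):
        print(f"{score:.4f}  {comment}")
```

The two closest vectors score almost identically here, so the relative order of the first two rows is not a reliable thing to check; the set of three comments returned is the more useful comparison.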

      I am raising this ticket so that the website PR has an issue to link to.

            People

              Assignee: Jackson Fleming
              Reporter: Jackson Fleming
              Authors: Jackson Fleming
              Reviewers: Lorina Poland, Michael Semb Wever
              Votes: 0
              Watchers: 4

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 50m