  1. Apache Cassandra
  2. CASSANDRA-6151

CqlPagingRecordReader Used when Partition Key Is Explicitly Stated


Details

    • Bug
    • Status: Resolved
    • Low
    • Resolution: Won't Fix
    • None
    • None
    • None
    • Low

    Description

      From http://stackoverflow.com/questions/19189649/composite-key-in-cassandra-with-pig/19211546#19211546

      The user was attempting to load a single partition using a where clause in a pig load statement.

      CQL Table

      CREATE table data (
        occurday  text,
        seqnumber int,
        occurtimems bigint,
        unique bigint,
      
        fields map<text, text>,
      
        primary key ((occurday, seqnumber), occurtimems, unique)
      )
      

      Pig Load statement Query

      data = LOAD 'cql://ks/data?where_clause=seqnumber%3D10%20AND%20occurday%3D%272013-10-01%27' USING CqlStorage();    
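
      For readability, the URL-encoded where_clause in the load statement above can be decoded as follows. This is only an illustrative Python sketch using the standard library; it is not how CqlStorage itself parses the location, which the report does not show.

```python
from urllib.parse import urlsplit, parse_qs

# Location string copied from the Pig LOAD statement in this report.
location = ("cql://ks/data?where_clause="
            "seqnumber%3D10%20AND%20occurday%3D%272013-10-01%27")

parts = urlsplit(location)
keyspace = parts.netloc          # "ks"
table = parts.path.lstrip("/")   # "data"

# parse_qs percent-decodes the value: %3D is "=", %20 is a space, %27 is "'".
where_clause = parse_qs(parts.query)["where_clause"][0]

print(keyspace, table)
print(where_clause)
```

      The decoded clause places an equality relation on both partition key columns (occurday and seqnumber), so it can match at most one partition.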
      

      This results in an exception when processed by the CqlPagingRecordReader, which attempts to page the query even though it targets at most one partition. The result is an invalid CQL statement.

      CqlPagingRecordReader Query

      SELECT * FROM "data" WHERE token("occurday","seqnumber") > ? AND
      token("occurday","seqnumber") <= ? AND occurday='A Great Day' 
      AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
      

      Exception

       InvalidRequestException(why:occurday cannot be restricted by more than one relation if it includes an Equal)
      

      I'm not sure it is worth the special case, but a modification to not use the paging record reader when the entire partition key is specified would solve this issue.
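
      One way to detect that the entire partition key is specified is sketched below. The helper names (equality_columns, covers_partition_key) and the regex-based parsing are assumptions made for illustration; they are not taken from the Cassandra code base or from the attached patches, and the naive split on AND would mishandle string literals that contain " AND ".

```python
import re

# Hypothetical helper, not Cassandra code: matches a simple equality
# relation such as occurday='2013-10-01' or seqnumber = 10.
EQ_RELATION = re.compile(
    r"""^\s*"?(\w+)"?\s*=\s*('(?:[^']|'')*'|[^\s'<>=]+)\s*$"""
)


def equality_columns(where_clause):
    """Return the column names restricted by a plain '=' relation."""
    columns = set()
    # Naive split: assumes no string literal contains " AND ".
    for relation in re.split(r"\s+AND\s+", where_clause, flags=re.IGNORECASE):
        match = EQ_RELATION.match(relation)
        if match:
            columns.add(match.group(1))
    return columns


def covers_partition_key(where_clause, partition_keys):
    """True when every partition key column has an equality relation."""
    return set(partition_keys) <= equality_columns(where_clause)


partition_keys = ["occurday", "seqnumber"]
print(covers_partition_key("seqnumber=10 AND occurday='2013-10-01'", partition_keys))
print(covers_partition_key("occurday='2013-10-01'", partition_keys))
```

      When the check succeeds, the token range restriction is redundant, and combining it with the equality relations is what triggers the InvalidRequestException shown above.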

      Solution

      If the where clause has EQUAL relations on all the partition key columns, we use the query

        SELECT * FROM "data" 
        WHERE occurday='A Great Day' 
             AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
      

      instead of

        SELECT * FROM "data" 
        WHERE token("occurday","seqnumber") > ? 
         AND token("occurday","seqnumber") <= ? 
         AND occurday='A Great Day' 
         AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
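
      The choice between the two statements above can be sketched as a small query builder. This is a minimal illustration under stated assumptions: build_select and its parameters are hypothetical names, the function only assembles the CQL string, and it does not reproduce the logic in the attached patches.

```python
def build_select(table, partition_keys, where_clause,
                 full_partition_key, limit=1000):
    """Assemble the SELECT text; hypothetical helper, not Cassandra code."""
    clauses = []
    if not full_partition_key:
        # Ring scan: restrict by token range so each split can be paged.
        token = "token(" + ",".join('"%s"' % key for key in partition_keys) + ")"
        clauses += [token + " > ?", token + " <= ?"]
    if where_clause:
        clauses.append(where_clause)

    query = 'SELECT * FROM "%s"' % table
    if clauses:
        query += " WHERE " + " AND ".join(clauses)
    return query + " LIMIT %d ALLOW FILTERING" % limit


keys = ["occurday", "seqnumber"]
user_clause = "occurday='A Great Day' AND seqnumber=1"

# Entire partition key given: no token() relations are added.
print(build_select("data", keys, user_clause, full_partition_key=True))
# Otherwise: the token range is prepended, as in the original reader query.
print(build_select("data", keys, user_clause, full_partition_key=False))
```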
      

      The baseline implementation retrieves all data from all rows around the ring. This new feature retrieves all data from a single wide row, which is one level lower than the baseline. It helps in the use case where the user is only interested in a specific wide row, so the job does not have to scan every row around the ring.

      Attachments

        1. 6151-v4-1.2.10-branch.txt
          3 kB
          Shridhar
        2. 6151-v3-1.2-branch.txt
          33 kB
          Alex Liu
        3. 6151-v2-1.2-branch.txt
          26 kB
          Alex Liu
        4. 6151-1.2-branch.txt
          25 kB
          Alex Liu

            People

              alexliu68 Alex Liu
              rspitzer Russell Spitzer
              Alex Liu
              Jonathan Ellis