Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-7407

COPY command does not work properly with collections causing failure to import data

Agile BoardAttach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskConvert to sub-taskMoveLinkCloneLabelsUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: 1.2.17, 2.0.9, 2.1 rc2
    • Component/s: None
    • Labels:
      None
    • Environment:

      cqlsh 4.1.1,
      Cassandra 2.0.7.31,
      CQL spec 3.1.1,
      Thrift protocol 19.39.0

    • Severity:
      Normal

      Description

      The COPY command does not properly format collections in the output CSV - to be able to re-import the data.

      Here is how you can replicate the problem:

      CREATE TABLE user_colors ( 
      user_id int PRIMARY KEY, 
      colors list<ascii> 
      );
      
      UPDATE user_colors SET colors = ['red','blue'] WHERE user_id=5; 
      UPDATE user_colors SET colors = ['purple','yellow'] WHERE user_id=6; 
      UPDATE user_colors SET colors = ['black''] WHERE user_id=7;
      
      COPY user_colors (user_id, colors) TO 'output.csv';
      
      CREATE TABLE user_colors2 ( 
      user_id int PRIMARY KEY, 
      colors list<ascii> 
      );
      
      COPY user_colors2 (user_id, colors ) FROM 'user_colors.csv';
      Bad Request: line 1:68 no viable alternative at input ']'
      Aborting import at record #0 (line 1). Previously-inserted values still present.
      0 rows imported in 0.007 seconds.
      

      The CSV file seems to be malformed

      • The single quotes within the collection are missing
      • The double quotes for collection on user_id=7 are missing and causing COPY to fail.
        5,"[red, blue]"
        7,[black]
        6,"[purple, yellow]"
        

      Should be like this

      5,"['red', 'blue']"
      7,"['black']"
      6,"['purple', 'yellow']"
      

      Once the file is changed, the import works

      COPY user_colors2 (user_id, colors ) FROM 'user_colors.csv';
      3 rows imported in 0.012 seconds.
      SELECT * FROM user_colors2;
      
       user_id | colors
      ---------+------------------
             5 |      [red, blue]
             7 |          [black]
             6 | [purple, yellow]
      
      (3 rows)
      

        Attachments

        Issue Links

          Activity

            People

            • Assignee:
              mishail Mikhail Stepura Assign to me
              Reporter:
              jpoblete Jose Martinez Poblete
              Authors:
              Mikhail Stepura
              Reviewers:
              Brandon Williams

              Dates

              • Created:
                Updated:
                Resolved:

                Issue deployment