  1. Cassandra
  2. CASSANDRA-3371

Cassandra inferred schema and actual data don't match

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: 1.0.8
    • Component/s: None
    • Labels: None
    • Severity: Normal

      Description

      There appears to be a mismatch between the schema reported by the latest CassandraStorage.java and the data it actually returns. Here's an example:

      rows = LOAD 'cassandra://Frap/PhotoVotes' USING CassandraStorage();
      DESCRIBE rows;
      rows: {key: chararray,columns: {(name: chararray,value: bytearray,photo_owner: chararray,value_photo_owner: bytearray,pid: chararray,value_pid: bytearray,matched_string: chararray,value_matched_string: bytearray,src_big: chararray,value_src_big: bytearray,time: chararray,value_time: bytearray,vote_type: chararray,value_vote_type: bytearray,voter: chararray,value_voter: bytearray)}}
      DUMP rows;
      (691831038_1317937188.48955,{(photo_owner,1596090180),(pid,6855155124568798560),(matched_string,),(src_big,),(time,Thu Oct 06 14:39:48 -0700 2011),(vote_type,album_dislike),(voter,691831038)})

      getSchema() is reporting the columns as an inner bag of tuples, each of which contains 16 values. In fact, getNext() seems to return an inner bag containing 7 tuples, each of which contains two values.
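
The arity mismatch described above can be checked with a minimal, stand-alone sketch. It does not use the Pig or Cassandra APIs; the class name `SchemaMismatchCheck` and its members are hypothetical, and the data is copied from the DESCRIBE and DUMP output in this report.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone model of the mismatch; it does not call Pig or Cassandra.
class SchemaMismatchCheck {

    // Field names of the inner tuple, as reported by DESCRIBE in this report.
    static final List<String> DECLARED_FIELDS = Arrays.asList(
        "name", "value",
        "photo_owner", "value_photo_owner",
        "pid", "value_pid",
        "matched_string", "value_matched_string",
        "src_big", "value_src_big",
        "time", "value_time",
        "vote_type", "value_vote_type",
        "voter", "value_voter");

    // The inner bag as printed by DUMP: (name, value) pairs, one per column.
    static final List<List<String>> RETURNED_BAG = Arrays.asList(
        Arrays.asList("photo_owner", "1596090180"),
        Arrays.asList("pid", "6855155124568798560"),
        Arrays.asList("matched_string", ""),
        Arrays.asList("src_big", ""),
        Arrays.asList("time", "Thu Oct 06 14:39:48 -0700 2011"),
        Arrays.asList("vote_type", "album_dislike"),
        Arrays.asList("voter", "691831038"));

    // Counts the tuples whose number of values differs from the declared field count.
    static int countArityMismatches(List<String> declared, List<List<String>> bag) {
        int mismatches = 0;
        for (List<String> tuple : bag) {
            if (tuple.size() != declared.size()) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("declared fields per tuple: " + DECLARED_FIELDS.size());
        System.out.println("tuples in returned bag:    " + RETURNED_BAG.size());
        System.out.println("tuples with wrong arity:   "
            + countArityMismatches(DECLARED_FIELDS, RETURNED_BAG));
    }
}
```

Every tuple in the returned bag has two values while the declared inner tuple has 16 fields, so all seven tuples are counted as mismatches. This is consistent with the "Incompatible field schema" error discussed in the thread linked below.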

      The schema and the returned data appear to have gone out of sync with this change:
      http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java?r1=1177083&r2=1177082&pathrev=1177083

      See more discussion at:
      http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/pig-cassandra-problem-quot-Incompatible-field-schema-quot-error-tc6882703.html

        Attachments

        1. smoke_test.txt
          4 kB
          Brandon Williams
        2. pig.diff
          7 kB
          Pete Warden
        3. 3371-v6-cleanup.patch
          7 kB
          Pavel Yaskevich
        4. 3371-v6.txt
          17 kB
          Brandon Williams
        5. 3371-v5-rebased.txt
          8 kB
          Brandon Williams
        6. 3371-v5.txt
          8 kB
          Brandon Williams
        7. 3371-v4.txt
          7 kB
          Brandon Williams
        8. 3371-v3.txt
          9 kB
          Brandon Williams
        9. 3371-v2.txt
          5 kB
          Brandon Williams
        10. 0002-Output-support-to-match-input.txt
          10 kB
          Brandon Williams
        11. 0001-Rework-pig-schema.txt
          9 kB
          Brandon Williams

              People

              • Assignee: Brandon Williams (brandon.williams)
              • Reporter: Pete Warden (petewarden)
              • Authors: Brandon Williams
              • Reviewers: Pavel Yaskevich
              • Votes: 2
              • Watchers: 3
