Cassandra / CASSANDRA-5488

CassandraStorage throws NullPointerException (NPE) when widerows is set to 'true'

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 1.1.12, 1.2.6
    • Component/s: Hadoop
    • Labels:
    • Environment: Ubuntu 12.04.1 x64, Cassandra 1.2.4

      Description

      CassandraStorage throws NPE when widerows is set to 'true'.

      There are two problems in getNextWide:
      1. The tuple is created without specifying a size.
      2. addKeyToTuple is called on lastKey instead of key.

      java.lang.NullPointerException
      at org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:167)
      at org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:124)
      at org.apache.cassandra.cql.jdbc.JdbcUTF8.getString(JdbcUTF8.java:73)
      at org.apache.cassandra.cql.jdbc.JdbcUTF8.compose(JdbcUTF8.java:93)
      at org.apache.cassandra.db.marshal.UTF8Type.compose(UTF8Type.java:34)
      at org.apache.cassandra.db.marshal.UTF8Type.compose(UTF8Type.java:26)
      at org.apache.cassandra.hadoop.pig.CassandraStorage.addKeyToTuple(CassandraStorage.java:313)
      at org.apache.cassandra.hadoop.pig.CassandraStorage.getNextWide(CassandraStorage.java:196)
      at org.apache.cassandra.hadoop.pig.CassandraStorage.getNext(CassandraStorage.java:224)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:194)
      at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
      at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
      at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:415)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
      at org.apache.hadoop.mapred.Child.main(Child.java:249)
      2013-04-16 12:28:03,671 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
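
      The two problems named in the description can be illustrated outside of Cassandra. What follows is a minimal Python analogy, not the CassandraStorage code: `compose_utf8`, `build_row_buggy`, and `build_row_fixed` are hypothetical names, a list stands in for the Pig tuple, and `None` stands in for a null ByteBuffer. It assumes that lastKey has not been set yet when the first row is read, and that the tuple has two slots (key and columns).

```python
def compose_utf8(raw):
    """Decode a raw key. Like the Java compose() call, it fails on a null key."""
    if raw is None:
        # Stands in for the NullPointerException in the stack trace.
        raise TypeError("cannot compose a null key")
    return raw.decode("utf-8")


def build_row_buggy(key, last_key):
    """Analogy of the reported behavior (hypothetical, simplified)."""
    row = []                         # problem 1: no slots allocated
    row[0] = compose_utf8(last_key)  # problem 2: last_key used instead of key
    return row


def build_row_fixed(key, last_key):
    """Analogy of the fix: size the row up front and compose the current key."""
    row = [None, None]               # assumed layout: slot 0 = key, slot 1 = columns
    row[0] = compose_utf8(key)
    return row


if __name__ == "__main__":
    print(build_row_fixed(b"row_key1", None))
```

      In this analogy, `build_row_buggy` fails on the null key when `last_key` is `None`, and fails on the unsized row even when `last_key` is set, so fixing only one of the two problems is not enough.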

      1. 5488.txt
        3 kB
        Sheetal Gosrani
      2. 5488-2.txt
        4 kB
        Jeremy Hanna

        Activity

        Sheetal Gosrani added a comment -

        This patch (5488.txt) fixes the issue.

        Brandon Williams added a comment -

        Can you add a test to examples/pig/test/test_storage.pig that demonstrates the problem?

        Jeremy Hanna added a comment -

        I've reproduced this with 1.1.9 as well.

        Jeremy Hanna added a comment -

        Looks like it's from CASSANDRA-5098

        Jeremy Hanna added a comment -

        Here is an alternative way to do it: consolidate the two methods and check for null in the consolidated method.

        Brandon Williams added a comment -

        Committed v2, and also flipped the copy test to use widerow mode as a smoke test.

        Jeremy Hanna added a comment -

        There ended up being a secondary problem that was hidden by the first NPE. It seems to be related to getting the AbstractType. The NPE came from this line: https://github.com/apache/cassandra/blob/cassandra-1.1/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java#L307. I decomposed that line to find out what it was NPEing on, and got this:

        List<AbstractType> atList = getDefaultMarshallers(cfDef);
        AbstractType at = atList.get(2);
        Object o = at.compose(key); // NPE from this line
        setTupleValue(tuple, 0, o);
        // Original one-liner:
        // setTupleValue(tuple, 0, getDefaultMarshallers(cfDef).get(2).compose(key));
        

        So it seems unrelated to the original NPE, but still matches the description of this ticket.

        To reproduce, here is my schema:

        CREATE KEYSPACE circus
        with placement_strategy = 'SimpleStrategy'
        and strategy_options = {replication_factor:1};
        
        use circus;
        
        CREATE COLUMN FAMILY acrobats
        WITH comparator = UTF8Type
        AND key_validation_class=UTF8Type
        AND default_validation_class = UTF8Type;
        

        Here is a pycassa script to create the data:

        from pycassa.pool import ConnectionPool
        from pycassa.columnfamily import ColumnFamily
        
        pool = ConnectionPool('circus')
        col_fam = ColumnFamily(pool, 'acrobats')  # ColumnFamily is imported directly above
        
        for i in range(1, 10):
            for j in range(1, 200000):
                col_fam.insert('row_key' + str(i), {str(j): 'val'})
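
        As a quick sanity check on the size of the test data (plain Python, no Cassandra connection needed): because `range` excludes its upper bound, the loops above write 9 rows of 199,999 columns each, not 10 rows of 200,000. The names `row_keys` and `column_names` are introduced here only for illustration.

```python
# Rebuild the keys and column names the insert loops generate, without touching Cassandra.
row_keys = ['row_key' + str(i) for i in range(1, 10)]
column_names = [str(j) for j in range(1, 200000)]

print(len(row_keys))                      # 9
print(len(column_names))                  # 199999
print(len(row_keys) * len(column_names))  # 1799991 inserts in total
```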
        

        Here is the pig (0.9.2) that I'm running in local mode:

        rows = LOAD 'cassandra://circus/acrobats?widerows=true&limit=200000' USING CassandraStorage();
        filtered = filter rows by key == 'row_key1';
        columns = foreach filtered generate flatten(columns);
        counted = foreach (group columns all) generate COUNT($1);
        dump counted;
        
        Brandon Williams added a comment -

        v2 was a little too aggressive in function consolidation. I reverted it and applied v1.


          People

          • Assignee: Sheetal Gosrani
          • Reporter: Sheetal Gosrani
          • Reviewer: Brandon Williams