CASSANDRA-2981

Provide Hadoop read access to Counter Columns.

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 0.8.3
    • Component/s: None
    • Labels:
      None

      Description

      o.a.c.hadoop.ColumnFamilyRecordReader does not test for counter columns, which are separate fields in the ColumnOrSuperColumn struct. Currently it assumes that anything that is not a standard column is a super column, so reading a counter column fails with a NullPointerException:

      2011-07-26 17:23:34,376 ERROR CliDriver (SessionState.java:printError(343)) - Failed with exception java.io.IOException:java.lang.NullPointerException
      java.io.IOException: java.lang.NullPointerException
      	at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:341)
      	at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:133)
      	at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1114)
      	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
      	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
      	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
      Caused by: java.lang.NullPointerException
      	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.unthriftifySuper(ColumnFamilyRecordReader.java:303)
      	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.unthriftify(ColumnFamilyRecordReader.java:297)
      	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:288)
      	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:177)
      	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
      	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
      	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:136)
      	at org.apache.hadoop.hive.cassandra.input.HiveCassandraStandardColumnInputFormat$2.next(HiveCassandraStandardColumnInputFormat.java:153)
      	at org.apache.hadoop.hive.cassandra.input.HiveCassandraStandardColumnInputFormat$2.next(HiveCassandraStandardColumnInputFormat.java:111)
      	at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:326)
      	... 10 more
      

      My plan is to return o.a.c.db.CounterColumn objects, just like the o.a.c.db.Column and SuperColumn objects that are already returned.
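
To illustrate the failure mode described above, here is a minimal sketch in Java. The classes `Col`, `SuperCol`, `CounterCol` and `Cosc` are hypothetical stand-ins written for this example, not the real Thrift or o.a.c.db classes, and `naiveKind` / `checkedKind` are illustrative names rather than methods of ColumnFamilyRecordReader. The sketch assumes the reader falls through to the super column branch whenever the plain column field is unset.

```java
// Hypothetical stand-ins for the Thrift structs; NOT the real Cassandra classes.
final class CosDispatchSketch {
    static final class Col {
        final String name;
        Col(String name) { this.name = name; }
    }

    static final class SuperCol {
        final String name;
        SuperCol(String name) { this.name = name; }
    }

    static final class CounterCol {
        final String name;
        final long value;
        CounterCol(String name, long value) { this.name = name; this.value = value; }
    }

    // Union-like holder: exactly one field is expected to be non-null.
    static final class Cosc {
        Col column;
        SuperCol superColumn;
        CounterCol counterColumn;
    }

    // Naive dispatch: anything that is not a plain column is assumed to be a super column.
    static String naiveKind(Cosc c) {
        if (c.column != null) {
            return "column:" + c.column.name;
        }
        // Throws NullPointerException when only counterColumn is set.
        return "super:" + c.superColumn.name;
    }

    // Checked dispatch: test each field explicitly before dereferencing it.
    static String checkedKind(Cosc c) {
        if (c.column != null) {
            return "column:" + c.column.name;
        }
        if (c.counterColumn != null) {
            return "counter:" + c.counterColumn.name;
        }
        if (c.superColumn != null) {
            return "super:" + c.superColumn.name;
        }
        throw new IllegalArgumentException("empty ColumnOrSuperColumn");
    }

    public static void main(String[] args) {
        Cosc c = new Cosc();
        c.counterColumn = new CounterCol("hits", 3L);
        System.out.println(checkedKind(c));
    }
}
```

With only `counterColumn` set, `naiveKind` dereferences a null `superColumn`, which matches the shape of the NullPointerException in the trace above; `checkedKind` handles the same input without error.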

        Activity

        amorton amorton added a comment -

        0001-2981-hadoop-counters-input.patch modifies the CFRR to turn CounterColumns returned through the thrift API into o.a.c.db.Column instances.

        I could not use CounterColumn, because CounterContext needs to read the node ID, which requires a running StorageService and access to cassandra.yaml.

        It's not great, but the full CounterColumn should not be needed, as Hadoop only has read access. Let me know if it's too hacky.

        Also added another test to the hadoop_word_count example that sums the counter columns in a row.
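
A rough sketch of the idea in this comment, for readers who want to see its shape: a counter is carried as a plain column whose value is the count, and a job sums those values per row. `PlainColumn`, `fromCounter` and `sumRow` are hypothetical names for this example only, and the 8-byte big-endian encoding of the count is an assumption, not something confirmed by the patch.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

// Illustrative only; these are not the classes touched by the patch.
final class CounterSumSketch {
    // Hypothetical plain column: a name plus an opaque value buffer.
    static final class PlainColumn {
        final String name;
        final ByteBuffer value;
        PlainColumn(String name, ByteBuffer value) {
            this.name = name;
            this.value = value;
        }
    }

    // Assumption: the count is stored as an 8-byte big-endian long.
    static PlainColumn fromCounter(String name, long count) {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.putLong(count);
        buf.flip(); // make the buffer readable from position 0
        return new PlainColumn(name, buf);
    }

    // Sum the counter values of all columns in one row.
    static long sumRow(List<PlainColumn> row) {
        long total = 0;
        for (PlainColumn c : row) {
            // duplicate() so that reading does not move the original buffer's position
            total += c.value.duplicate().getLong();
        }
        return total;
    }

    public static void main(String[] args) {
        List<PlainColumn> row = Arrays.asList(fromCounter("a", 2L), fromCounter("b", 5L));
        System.out.println(sumRow(row));
    }
}
```

Because the consumer only reads the value, nothing here needs the node ID or any server-side state, which is the reason given above for not using the full CounterColumn.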

        amorton amorton added a comment -

        As discussed in IRC, patch to read Counter Columns through CFRR. They should now be available via Brisk.

        jbellis Jonathan Ellis added a comment -

        committed, thanks!

        hudson Hudson added a comment -

        Integrated in Cassandra-0.8 #248 (See https://builds.apache.org/job/Cassandra-0.8/248/)
        add counter support to Hadoop InputFormat
        patch by Aaron Morton; reviewed by jbellis for CASSANDRA-2981

        jbellis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1152782
        Files :

        • /cassandra/branches/cassandra-0.8/CHANGES.txt
        • /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordReader.java
        • /cassandra/branches/cassandra-0.8/examples/hadoop_word_count/bin/word_count_counters
        • /cassandra/branches/cassandra-0.8/examples/hadoop_word_count/src/WordCountCounters.java
        • /cassandra/branches/cassandra-0.8/examples/hadoop_word_count/README.txt
        • /cassandra/branches/cassandra-0.8/examples/hadoop_word_count/src/WordCountSetup.java

          People

          • Assignee:
            amorton amorton
            Reporter:
            amorton amorton
            Reviewer:
            Jonathan Ellis
