Apache Gora / GORA-211

thread safety: java.lang.NullPointerException


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.2
    • Fix Version/s: 0.3
    • Component/s: gora-cassandra
    • Labels: None
    • Environment:
      nutch 2.1 / cassandra 1.2.1 / gora-cassandra 0.2 / gora-core 0.2.1
      running fetch with parse=true
      fetcher.threads.per.queue=2

      nutch on a 16 core AMD Opteron 2GHz
      Cassandra on 8 core Intel Xeon 3.3 GHz

      Description

      This is the result of debugging one of my issues described in NUTCH-1534.

      example trace:
      java.lang.NullPointerException
      at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
      at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:71)
      at org.apache.gora.cassandra.store.CassandraClient.addColumn(CassandraClient.java:139)
      at org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:307)
      at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:212)
      at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
      at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:587)
      at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
      at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.output(FetcherReducer.java:664)
      at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.run(FetcherReducer.java:534)

      I suspect that CassandraStore.put() does not take enough precautions to copy all objects safely into its buffer:

              switch(type) {
                case RECORD:
                  Persistent persistent = (Persistent) fieldValue;
                  Persistent newRecord = persistent.newInstance(new StateManagerImpl());
                  for (Field member: fieldSchema.getFields()) {
                    newRecord.put(member.pos(), persistent.get(member.pos()));
                  }
                  fieldValue = newRecord;
                  break;
                case MAP:
                  StatefulHashMap<?, ?> map = (StatefulHashMap<?, ?>) fieldValue;
                  StatefulHashMap<?, ?> newMap = new StatefulHashMap(map);
                  fieldValue = newMap;
                  break;
              }
      

      case RECORD - don't we also need to duplicate the object returned by "persistent.get(member.pos())" in this call?
      newRecord.put(member.pos(), persistent.get(member.pos()))
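
      To illustrate the RECORD concern, here is a minimal sketch using plain JDK classes only. `Page`, `shallowCopy` and `deepCopy` are hypothetical names and not part of Gora; the sketch assumes the field holds a mutable value such as a ByteBuffer. Copying only the reference leaves the buffered record exposed to later changes made through the original:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class RecordCopySketch {

    /** Hypothetical stand-in for a record with one mutable field. */
    static final class Page {
        ByteBuffer content;
        Page(ByteBuffer content) { this.content = content; }
    }

    /** Copies the field reference only, like a per-field put(pos, get(pos)). */
    static Page shallowCopy(Page src) {
        return new Page(src.content);
    }

    /** Copies the bytes, so later changes to the source are not visible. */
    static Page deepCopy(Page src) {
        // duplicate() shares the bytes but has its own position and limit,
        // so reading from it does not disturb the source buffer.
        ByteBuffer view = src.content.duplicate();
        ByteBuffer copy = ByteBuffer.allocate(view.remaining());
        copy.put(view);
        copy.flip();
        return new Page(copy);
    }

    /** Decodes the field without moving the buffer's own position. */
    static String text(Page p) {
        return StandardCharsets.UTF_8.decode(p.content.duplicate()).toString();
    }

    public static void main(String[] args) {
        Page original = new Page(ByteBuffer.wrap("abc".getBytes(StandardCharsets.UTF_8)));
        Page shallow = shallowCopy(original);
        Page deep = deepCopy(original);

        // Simulate another thread modifying the original record after the copy.
        original.content.put(0, (byte) 'X');

        System.out.println(text(shallow)); // Xbc
        System.out.println(text(deep));    // abc
    }
}
```

      Whether the real field types need this treatment depends on which of them are mutable; immutable values can safely be shared.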

      case MAP - do we not need to duplicate all value-objects of the map?
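
      The MAP question can be sketched the same way. This is an illustration with java.util.HashMap, not with StatefulHashMap, and `deepCopy` is a hypothetical helper: a copy constructor copies the entries but still shares every value object with the source map, so a value-level copy is needed if the values are mutable.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

public class MapCopySketch {

    /** Copies the map structure and each value using the supplied copier. */
    static <K, V> Map<K, V> deepCopy(Map<K, V> source, UnaryOperator<V> valueCopier) {
        Map<K, V> copy = new HashMap<>();
        for (Map.Entry<K, V> entry : source.entrySet()) {
            copy.put(entry.getKey(), valueCopier.apply(entry.getValue()));
        }
        return copy;
    }

    public static void main(String[] args) {
        Map<String, StringBuilder> original = new HashMap<>();
        original.put("title", new StringBuilder("old"));

        // Copy constructor: new map, same value objects.
        Map<String, StringBuilder> shallow = new HashMap<>(original);
        // Value-level copy: new map, new value objects.
        Map<String, StringBuilder> deep = deepCopy(original, sb -> new StringBuilder(sb));

        // Simulate a change made through the original map after the copy.
        original.get("title").append("-changed");

        System.out.println(shallow.get("title")); // old-changed
        System.out.println(deep.get("title"));    // old
    }
}
```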

      I have not had time to write a patch or test this, so please comment.

        Attachments

        1. GORA-211-trunk-v3.patch
          2 kB
          Roland von Herget
        2. GORA-211-trunk-v2.patch
          3 kB
          Roland von Herget
        3. GORA-211-trunk.patch
          2 kB
          Roland von Herget
        4. GORA-211-0.2.patch
          4 kB
          Roland von Herget


              People

              • Assignee: Roland von Herget (rherget)
              • Reporter: Roland von Herget (rherget)