NUTCH-1477: NPE when injecting with DataFileAvroStore


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Auto Closed
    • Affects Version/s: 2.1
    • Fix Version/s: 2.5
    • Component/s: storage
    • Labels: None
    • Environment: Java 1.6.0_35

      Description

      Fresh installation of Nutch 2.1, configured to use DataFileAvroStore. The injection job throws a NullPointerException (stack trace below). The error does not occur when I switch to MemStore.

      java.lang.NullPointerException
      at org.apache.avro.io.BinaryEncoder.writeString(BinaryEncoder.java:133)
      at org.apache.avro.generic.GenericDatumWriter.writeString(GenericDatumWriter.java:176)
      at org.apache.avro.generic.GenericDatumWriter.writeString(GenericDatumWriter.java:171)
      at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:72)
      at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:89)
      at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:62)
      at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:55)
      at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:245)
      at org.apache.gora.avro.store.DataFileAvroStore.put(DataFileAvroStore.java:54)
      at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:60)
      at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
      at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
      at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:185)
      at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:85)
      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
      at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
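
      The top frames of the trace (BinaryEncoder.writeString called from GenericDatumWriter.writeString) are consistent with a null value reaching a field whose schema type is a plain string, although the report does not confirm which field is involved. The sketch below illustrates that failure mode only. It uses the Java standard library, the class and method names are hypothetical, and the byte layout is simplified; it is not Avro's or Gora's code.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a binary encoder. NOT Avro's BinaryEncoder.
final class NullStringEncodeSketch {

    // Writes a length-prefixed UTF-8 string. It dereferences its argument
    // without a null check, so a null value fails here, deep in the encoder,
    // rather than where the record was built.
    static void writeString(ByteArrayOutputStream out, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8); // NPE if value == null
        out.write(bytes.length);            // one-byte length prefix, enough for a sketch
        out.write(bytes, 0, bytes.length);
    }

    // Nullable variant: a one-byte branch tag (0 = null, 1 = string) followed
    // by the payload. This is a simplified analogue of declaring the field as
    // a union of null and string in the schema.
    static void writeNullableString(ByteArrayOutputStream out, String value) {
        if (value == null) {
            out.write(0);
            return;
        }
        out.write(1);
        writeString(out, value);
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeNullableString(out, null);                   // tolerated: tag byte only
        writeNullableString(out, "http://example.com/");  // tag byte plus string
        System.out.println("nullable encoding wrote " + out.size() + " bytes");

        try {
            writeString(new ByteArrayOutputStream(), null); // unset field, strict type
        } catch (NullPointerException e) {
            System.out.println("strict encoding failed with NullPointerException");
        }
    }
}
```

      If this reading is right, the options would be to populate every string field before the record is written or to make the affected fields nullable in the schema. Which of these applies to this issue is an assumption, not something stated in the report.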

        Attachments

        1. gora-core-0.2.1.jar
          147 kB
          Lewis John McGibbney
        2. NUTCH-1477.patch
          6 kB
          Lewis John McGibbney
        3. webpage.avsc
          2 kB
          Alfonso Nishikawa
        4. webpage.avsc
          2 kB
          Alfonso Nishikawa
        5. webpage.avsc
          2 kB
          Lewis John McGibbney
        6. webpage.avsc
          2 kB
          Julien Nioche

            People

            • Assignee: Unassigned
            • Reporter: Mike Baranczak (mbaranczak)