  1. Nutch
  2. NUTCH-492

java.lang.OutOfMemoryError while indexing.

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.9.0
    • Fix Version/s: None
    • Component/s: indexer
    • Labels: None

    Description

      I'm getting this:

      java.lang.OutOfMemoryError: Java heap space
      at java.util.Arrays.copyOf(Arrays.java:2786)
      at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
      at java.io.DataOutputStream.write(DataOutputStream.java:90)
      at org.apache.hadoop.io.Text.writeString(Text.java:399)
      at org.apache.nutch.metadata.Metadata.write(Metadata.java:225)
      at org.apache.nutch.parse.ParseData.write(ParseData.java:165)
      at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:154)
      at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:65)
      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:315)
      at org.apache.nutch.indexer.Indexer.map(Indexer.java:306)
      at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
      at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:126)
      2007-05-26 11:07:22,517 FATAL indexer.Indexer - Indexer: java.io.IOException: Job failed!
      at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
      at org.apache.nutch.indexer.Indexer.index(Indexer.java:273)
      at org.apache.nutch.indexer.Indexer.run(Indexer.java:295)
      at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:189)
      at org.apache.nutch.indexer.Indexer.main(Indexer.java:278)

      Something odd I'm seeing in hadoop.log is that the plugins are loaded again and again. I've created a custom plugin, in case that is relevant. According to the code, a new plugin repository is created for each "configuration object". I'm sure I'm not modifying the configuration object anywhere in my code (I've checked).

      Why are the plugins loaded again and again and again until the heap is full?
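
      To illustrate the behavior I suspect, here is a minimal sketch of a repository cache keyed by configuration instance. It is not Nutch's actual code: SketchConfig, SketchPluginRepository and RepositoryCacheSketch are hypothetical stand-ins, and the use of an identity-based map is my assumption about how "one repository per configuration object" could behave. If something hands the cache a fresh configuration object on every call, every lookup misses, the plugins are loaded again, and the cache keeps growing.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-in for a configuration object; NOT a Nutch or Hadoop class.
class SketchConfig { }

// Hypothetical stand-in for a loaded plugin repository; NOT a Nutch class.
class SketchPluginRepository {
    // Pretend this holds the parsed plugin descriptors.
    final byte[] descriptors = new byte[1024];
}

// Assumed caching scheme: one repository per configuration *instance*.
class RepositoryCacheSketch {
    // Keys are compared by identity, so two equal-looking configs are distinct keys.
    private final Map<SketchConfig, SketchPluginRepository> cache = new IdentityHashMap<>();
    private int loads = 0;

    SketchPluginRepository get(SketchConfig conf) {
        SketchPluginRepository repo = cache.get(conf);
        if (repo == null) {
            // Cache miss: "load the plugins" again and keep the result forever.
            repo = new SketchPluginRepository();
            cache.put(conf, repo);
            loads++;
        }
        return repo;
    }

    int loads() { return loads; }

    int size() { return cache.size(); }

    // A new configuration object per call: every call reloads the plugins.
    static int loadsWithFreshConfigs(int calls) {
        RepositoryCacheSketch sketch = new RepositoryCacheSketch();
        for (int i = 0; i < calls; i++) {
            sketch.get(new SketchConfig());
        }
        return sketch.loads();
    }

    // The same configuration object reused: plugins are loaded only once.
    static int loadsWithSharedConfig(int calls) {
        RepositoryCacheSketch sketch = new RepositoryCacheSketch();
        SketchConfig shared = new SketchConfig();
        for (int i = 0; i < calls; i++) {
            sketch.get(shared);
        }
        return sketch.loads();
    }
}
```

      In this sketch, loadsWithFreshConfigs(n) performs n loads and retains n repositories, while loadsWithSharedConfig(n) performs one. If the indexer ends up with a new configuration object per record or per task, that would match the repeated plugin loading in hadoop.log.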

          People

            Assignee: Dogacan Guney (dogacan)
            Reporter: Nicolás Lichtmaier (niqueco)
            Votes: 0
            Watchers: 1
