Nutch / NUTCH-1601

ElasticSearchIndexer fails to properly delete documents

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7
    • Fix Version/s: 1.8
    • Component/s: indexer
    • Labels: None
    • Patch Info: Patch Available

    Description

      An exception is thrown because the indexer does not set the index and type on delete requests. The code is carried over from the original source, so 2.x may be affected as well.

      java.io.IOException
              at org.apache.nutch.indexwriter.elastic.ElasticIndexWriter.makeIOException(ElasticIndexWriter.java:173)
              at org.apache.nutch.indexwriter.elastic.ElasticIndexWriter.delete(ElasticIndexWriter.java:168)
              at org.apache.nutch.indexer.IndexWriters.delete(IndexWriters.java:108)
              at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:52)
              at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:41)
              at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.write(ReduceTask.java:458)
              at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:500)
              at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:203)
              at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:53)
              at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:522)
              at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
              at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
      Caused by: org.elasticsearch.action.ActionRequestValidationException: Validation Failed: 1: index is missing;2: type is missing;
              at org.elasticsearch.action.ValidateActions.addValidationError(ValidateActions.java:29)
              at org.elasticsearch.action.support.replication.ShardReplicationOperationRequest.validate(ShardReplicationOperationRequest.java:126)
              at org.elasticsearch.action.delete.DeleteRequest.validate(DeleteRequest.java:84)
              at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:55)
              at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:83)
              at org.elasticsearch.client.support.AbstractClient.delete(AbstractClient.java:121)
              at org.elasticsearch.action.delete.DeleteRequestBuilder.doExecute(DeleteRequestBuilder.java:147)
              at org.elasticsearch.action.support.BaseRequestBuilder.execute(BaseRequestBuilder.java:53)
              at org.elasticsearch.action.support.BaseRequestBuilder.execute(BaseRequestBuilder.java:47)
              at org.apache.nutch.indexwriter.elastic.ElasticIndexWriter.delete(ElasticIndexWriter.java:165)
              ... 10 more
      2013-07-03 11:43:39,957 ERROR indexer.IndexingJob - Indexer: java.io.IOException: Job failed!
              at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
              at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:123)
              at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:185)
              at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
              at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:195) 
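
The validation failure in the trace above ("index is missing", "type is missing") can be reproduced with a small stand-alone model. This is only a sketch: `DeleteSketch` and its nested `Request` are hypothetical stand-ins, not Elasticsearch or Nutch classes, and the index name `nutch` and type `doc` are assumed values chosen for illustration. In the real client, the equivalent step would presumably be setting the index and type on the delete request builder before executing it.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Stand-alone model of the delete validation seen in the trace.
 * Hypothetical stand-in classes; no Elasticsearch or Nutch code is used.
 */
public class DeleteSketch {

    /** Minimal stand-in for a delete request: index, type and document id. */
    static final class Request {
        String index;
        String type;
        String id;

        /** Collects one message per missing field, in index, type, id order. */
        List<String> validate() {
            List<String> errors = new ArrayList<>();
            if (index == null) errors.add("index is missing");
            if (type == null) errors.add("type is missing");
            if (id == null) errors.add("id is missing");
            return errors;
        }
    }

    /** Failing pattern: only the document id is set on the request. */
    static Request deleteWithIdOnly(String id) {
        Request request = new Request();
        request.id = id;
        return request;
    }

    /** Corrected pattern: index and type are set alongside the id. */
    static Request deleteWithTarget(String index, String type, String id) {
        Request request = deleteWithIdOnly(id);
        request.index = index;
        request.type = type;
        return request;
    }

    public static void main(String[] args) {
        // Assumed example values; a real job would read them from its configuration.
        String key = "http://example.org/";
        System.out.println(deleteWithIdOnly(key).validate());
        System.out.println(deleteWithTarget("nutch", "doc", key).validate());
    }
}
```

Running `main` prints the two validation messages for the id-only request and an empty list for the request that carries an index and type.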
      

      Attachments

        1. NUTCH-1601-1.8.patch
          0.7 kB
          Markus Jelsma

          People

            Assignee: Markus Jelsma (markus17)
            Reporter: Markus Jelsma (markus17)
            Votes: 0
            Watchers: 3
