Nutch / NUTCH-259

Problem in IndexSorter after dedup

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: indexer
    • Labels: None

    Description

      When trying to run IndexSorter, I get the following error:

      Exception in thread "main" java.lang.IllegalArgumentException: attempt to access a deleted document
      at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:282)
      at org.apache.lucene.index.FilterIndexReader.document(FilterIndexReader.java:104)
      at org.apache.nutch.indexer.IndexSorter$SortingReader.document(IndexSorter.java:170)
      at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:186)
      at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:88)
      at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:579)
      at org.apache.nutch.indexer.IndexSorter.sort(IndexSorter.java:240)
      at org.apache.nutch.indexer.IndexSorter.main(IndexSorter.java:291)
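
The trace shows SegmentReader.document rejecting a document number that is flagged as deleted, which is consistent with dedup having marked duplicates as deleted before the sort ran. The following is a minimal sketch of that failure mode and of the usual guard (checking the deletion flag before fetching a document). ToyReader and liveDocuments are hypothetical stand-ins written for illustration; they are not the Lucene or Nutch classes, and this is not presented as the fix for IndexSorter.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class DeletedDocSketch {

    /** Toy stand-in for an index reader (hypothetical, not the Lucene API). */
    static class ToyReader {
        private final String[] docs;
        private final BitSet deleted = new BitSet();

        ToyReader(String... docs) {
            this.docs = docs;
        }

        int maxDoc() {
            return docs.length;
        }

        /** Marks a document as deleted without removing it, as dedup would. */
        void delete(int n) {
            deleted.set(n);
        }

        boolean isDeleted(int n) {
            return deleted.get(n);
        }

        /** Mirrors the reported behavior: fetching a deleted document throws. */
        String document(int n) {
            if (isDeleted(n)) {
                throw new IllegalArgumentException("attempt to access a deleted document");
            }
            return docs[n];
        }
    }

    /** Collects only live documents by checking the deletion flag first. */
    static List<String> liveDocuments(ToyReader reader) {
        List<String> live = new ArrayList<>();
        for (int i = 0; i < reader.maxDoc(); i++) {
            if (!reader.isDeleted(i)) {
                live.add(reader.document(i));
            }
        }
        return live;
    }

    public static void main(String[] args) {
        ToyReader reader = new ToyReader("a", "b", "c");
        reader.delete(1); // simulate dedup marking a duplicate as deleted
        System.out.println(liveDocuments(reader)); // prints [a, c]
    }
}
```

Iterating over every document number up to maxDoc() without the isDeleted check would throw at index 1 in this sketch, with the same message as in the trace above.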

          People

            Assignee: Unassigned
            Reporter: Michael (yalta)