Description
When running the ./bin/nutch index command with -dir <path/to/segment/dir>, I noticed that if a segment directory doesn't include crawl_* or parse_* data, the indexer fails (correctly). However, the indexer should be more resilient in those cases: we can add a simple check to see whether those dirs are present in each segment and proceed if they are; otherwise, ignore that segment, print a message, and move on to the other segments.
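A minimal sketch of the proposed check, not the actual patch. It uses plain java.nio on a local filesystem rather than the Hadoop FileSystem API the indexer really runs on, and the class name SegmentCheck, its method names, and the exact list of required subdirectories (crawl_fetch, crawl_parse, parse_data, parse_text) are assumptions made for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the Nutch code base.
final class SegmentCheck {

    // Assumed set of subdirectories the indexer reads from a segment.
    static final List<String> REQUIRED =
            List.of("crawl_fetch", "crawl_parse", "parse_data", "parse_text");

    // Returns true only if every required subdirectory exists in the segment.
    static boolean isComplete(Path segment) {
        for (String name : REQUIRED) {
            if (!Files.isDirectory(segment.resolve(name))) {
                return false;
            }
        }
        return true;
    }

    // Keeps complete segments; reports and skips the incomplete ones.
    static List<Path> usableSegments(List<Path> candidates) {
        List<Path> usable = new ArrayList<>();
        for (Path segment : candidates) {
            if (isComplete(segment)) {
                usable.add(segment);
            } else {
                System.err.println("Skipping incomplete segment: " + segment);
            }
        }
        return usable;
    }
}
```

With something like this in place, the indexer would pass the segments found under -dir through usableSegments before adding their inputs to the job, so one incomplete segment no longer fails the whole run.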
Issue Links
- duplicates NUTCH-1771 Solrindex fails if a segment is corrupted or incomplete (Closed)