I'm using Nutch 1.12 in local mode and Solr 4.10.3.
For some reason, I have found that Nutch indexes documents that carry a "noindex" robots meta tag.
I used the Nutch crawl script for a complete cycle:
bin/crawl -i urls/ crawl/ -2
with this url:
After several tests the problem persists: approximately 200 documents with this robots meta tag are indexed when they should not be.
I have read the configure method in the IndexerMapReduce.java class. It has a line that reads this property, but for some reason it does not behave as I expected.
this.deleteRobotsNoIndex = job.getBoolean(INDEXER_DELETE_ROBOTS_NOINDEX,false); (line 97)
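To make the behaviour of that line concrete, here is a minimal, self-contained sketch of how a boolean flag with a default of false resolves. It is not Nutch or Hadoop code: the class NoIndexFlagSketch and its getBoolean helper are hypothetical stand-ins that only approximate Hadoop's Configuration.getBoolean, and the property name "indexer.delete.robots.noindex" is my assumption for the value behind the INDEXER_DELETE_ROBOTS_NOINDEX constant.

```java
import java.util.HashMap;
import java.util.Map;

public class NoIndexFlagSketch {
    // Assumption: the string behind the INDEXER_DELETE_ROBOTS_NOINDEX constant.
    static final String INDEXER_DELETE_ROBOTS_NOINDEX = "indexer.delete.robots.noindex";

    // Hypothetical, simplified stand-in for Configuration.getBoolean(name, defaultValue):
    // returns the default when the property is missing or is not "true"/"false".
    static boolean getBoolean(Map<String, String> conf, String name, boolean defaultValue) {
        String value = conf.get(name);
        if (value == null) {
            return defaultValue;
        }
        value = value.trim();
        if (value.equalsIgnoreCase("true")) {
            return true;
        }
        if (value.equalsIgnoreCase("false")) {
            return false;
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();

        // Property not set anywhere: the flag falls back to the default.
        System.out.println(getBoolean(conf, INDEXER_DELETE_ROBOTS_NOINDEX, false));

        // Property set explicitly: the flag follows the configured value.
        conf.put(INDEXER_DELETE_ROBOTS_NOINDEX, "true");
        System.out.println(getBoolean(conf, INDEXER_DELETE_ROBOTS_NOINDEX, false));
    }
}
```

If the real lookup works the same way, the flag stays false unless the property is set explicitly in the configuration, so whether it is set in nutch-site.xml may be worth checking.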