Description
Once a SEVERE log record is written, Nutch stops all fetching permanently. The following excerpt is from the run() method in Fetcher.java:
    public void run() {
      synchronized (Fetcher.this)                   // count threads
      try {
        UTF8 key = new UTF8();
        CrawlDatum datum = new CrawlDatum();
        while (true) {
          if (LogFormatter.hasLoggedSevere())       // something bad happened
            break;                                  // exit
Note the last two lines of the excerpt. Once this check is hit, Nutch will never fetch again, because LogFormatter stores this flag in a static field.
(Note that LogFormatter.hasLoggedSevere() is also checked in org.apache.nutch.net.URLFilterChecker, so that class will be disabled as well.)
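To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern. It is not Nutch code: SevereLatch is a hypothetical stand-in for LogFormatter built on java.util.logging, and fetch() is a hypothetical loop that uses the same guard as Fetcher.run(). It assumes the flag lives in a static field, as described above.

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class SevereLatchDemo {

    // Hypothetical stand-in for LogFormatter: remembers in a static field
    // whether any SEVERE record has ever been published.
    static class SevereLatch extends Handler {
        private static volatile boolean loggedSevere = false;

        @Override
        public void publish(LogRecord record) {
            if (record.getLevel().intValue() >= Level.SEVERE.intValue()) {
                loggedSevere = true; // this sketch never clears the flag
            }
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}

        static boolean hasLoggedSevere() {
            return loggedSevere;
        }
    }

    // Hypothetical fetch loop with the same guard as the excerpt above.
    // Returns how many of the requested URLs were "fetched".
    static int fetch(int urls) {
        int fetched = 0;
        while (fetched < urls) {
            if (SevereLatch.hasLoggedSevere()) { // something bad happened
                break;                           // exit
            }
            fetched++;
        }
        return fetched;
    }

    public static void main(String[] args) {
        Logger log = Logger.getLogger("demo");
        log.setUseParentHandlers(false); // keep the console quiet
        log.addHandler(new SevereLatch());

        System.out.println("first run fetched:  " + fetch(3)); // 3
        log.severe("unrelated failure somewhere else in the process");
        System.out.println("second run fetched: " + fetch(3)); // 0
    }
}
```

In this sketch, every call to fetch() after the single SEVERE record returns 0 for as long as the process lives, even though the failure had nothing to do with fetching.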
This must be fixed, or Nutch cannot be run as any kind of long-running service. Furthermore, I believe it is a poor decision to rely on a logging event to determine the state of the application: this could have any number of side effects that would be extremely difficult to track down (as it already has for me).
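One possible direction, sketched here only as an illustration and not as a proposed patch, is to keep the abort state in the object that runs the fetch, and to set it explicitly where a fatal condition is detected. FetchRun and abort() are hypothetical names; nothing below is existing Nutch API.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ExplicitStateDemo {

    // Hypothetical per-run fetcher: the abort flag is an instance field,
    // set explicitly by the code that detects a fatal condition, and it
    // is independent of anything written to the log.
    static class FetchRun {
        private final AtomicBoolean aborted = new AtomicBoolean(false);

        void abort() {
            aborted.set(true);
        }

        // Returns how many of the requested URLs were "fetched".
        int fetch(int urls) {
            int fetched = 0;
            while (fetched < urls) {
                if (aborted.get()) { // explicit state, scoped to this run
                    break;
                }
                fetched++;
            }
            return fetched;
        }
    }

    public static void main(String[] args) {
        FetchRun first = new FetchRun();
        first.abort(); // simulate a fatal condition in the first run
        System.out.println("aborted run fetched: " + first.fetch(3)); // 0

        FetchRun second = new FetchRun(); // a new run starts with clean state
        System.out.println("fresh run fetched:   " + second.fetch(3)); // 3
    }
}
```

Because the flag belongs to a single run, a failure in one run does not disable later runs in the same process, which is what a long-running service needs.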