Description
Nutch does not support continuous crawling out of the box. This is doable with cron, and for some crawls it is irrelevant because of their size, but it is still a nice feature to have.
This patch adds a new option to the bin/crawl script (-w|--wait) that sets how long to wait when the generator returns 0 (that is, when no URLs are scheduled for fetching).
The new option takes a value in NUMBER[SUFFIX] format. If no suffix is provided, the amount of time is assumed to be in seconds. Valid suffixes are:
s - seconds
m - minutes
h - hours
d - days
If -1 is passed, or the option is not used at all, the default behaviour of exiting the script applies.
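To illustrate the NUMBER[SUFFIX] format described above, here is a minimal sketch of how such a value could be converted to seconds in shell. The function name wait_to_seconds is hypothetical and is not taken from the patch; the actual implementation in bin/crawl may differ.

```shell
# Hypothetical helper (not from the patch): convert NUMBER[SUFFIX] to seconds.
# A missing argument or -1 is printed back as -1, meaning "do not wait, exit".
wait_to_seconds() {
  local value="${1:--1}"
  local number factor

  if [ "$value" = "-1" ]; then
    echo "-1"
    return 0
  fi

  # Split off the optional suffix and pick the matching multiplier.
  case "$value" in
    *s) number="${value%s}"; factor=1 ;;
    *m) number="${value%m}"; factor=60 ;;
    *h) number="${value%h}"; factor=3600 ;;
    *d) number="${value%d}"; factor=86400 ;;
    *)  number="$value";     factor=1 ;;   # no suffix: seconds
  esac

  # Reject anything that is not a plain non-negative integer.
  case "$number" in
    ""|*[!0-9]*)
      echo "invalid wait value: $value" >&2
      return 1
      ;;
  esac

  # Strip leading zeros so the shell does not read the number as octal.
  while [ "${#number}" -gt 1 ] && [ "${number#0}" != "$number" ]; do
    number="${number#0}"
  done

  echo $(( number * factor ))
}

wait_to_seconds 5m   # prints 300
wait_to_seconds 30   # prints 30
wait_to_seconds -1   # prints -1
```

With a helper like this, the script could pass the result to sleep when the generator schedules no URLs, and exit when the result is -1.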