Description
An exhaustive test of the matrix of CrawlDatum state transitions (CrawlStatus in 2.x) would be useful for detecting errors, especially in continuous crawls, where the number of possible transitions is quite large. Additional factors that affect state transitions (retry counters, static and dynamic intervals) are also tested.
The tests will help to address NUTCH-578 and NUTCH-1245. See the latter for a first, sketchy patch.
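To illustrate the idea of an exhaustive transition test, here is a minimal, language-neutral sketch in Python. It does not use any Nutch classes: the `update` function is a hypothetical toy stand-in for the CrawlDb update step, the status names only mirror the CrawlDatum db/fetch status naming, and `MAX_RETRIES = 3` is an assumed retry limit. The point is only the test shape: enumerate every combination of old state, retry counter, and fetch result, and check invariants on each outcome.

```python
from itertools import product

# Status names modeled on CrawlDatum statuses; the transition rules
# below are a toy assumption, not Nutch's actual behavior.
DB_STATES = ["db_unfetched", "db_fetched", "db_gone",
             "db_redir_temp", "db_redir_perm", "db_notmodified"]
FETCH_RESULTS = ["fetch_success", "fetch_gone", "fetch_redir_temp",
                 "fetch_redir_perm", "fetch_notmodified", "fetch_retry"]
MAX_RETRIES = 3  # assumed retry limit for this sketch

# Toy mapping from a non-retry fetch result to the resulting db state.
FETCH_TO_DB = {
    "fetch_success": "db_fetched",
    "fetch_gone": "db_gone",
    "fetch_redir_temp": "db_redir_temp",
    "fetch_redir_perm": "db_redir_perm",
    "fetch_notmodified": "db_notmodified",
}

def update(db_state, retries, fetch_result):
    """Hypothetical update step: returns (new_db_state, new_retries)."""
    if fetch_result == "fetch_retry":
        retries += 1
        # Give up once the retry counter reaches the limit.
        if retries >= MAX_RETRIES:
            return "db_gone", retries
        return "db_unfetched", retries
    # Any definitive fetch result resets the retry counter.
    return FETCH_TO_DB[fetch_result], 0

def check_all_transitions():
    """Walk the full (state, retries, result) matrix and collect violations."""
    cases, failures = 0, []
    for state, retries, result in product(
            DB_STATES, range(MAX_RETRIES), FETCH_RESULTS):
        cases += 1
        new_state, new_retries = update(state, retries, result)
        if new_state not in DB_STATES:
            failures.append((state, retries, result, "invalid state"))
        if result != "fetch_retry" and new_retries != 0:
            failures.append((state, retries, result, "retries not reset"))
        if new_retries > MAX_RETRIES:
            failures.append((state, retries, result, "retries over limit"))
    return cases, failures

if __name__ == "__main__":
    cases, failures = check_all_transitions()
    print(f"checked {cases} transitions, {len(failures)} failures")
```

A real test would replace `update` with the actual update code and extend the matrix with the fetch schedule in use (static or dynamic intervals), which multiplies the number of combinations and is the reason an automated, exhaustive walk is preferable to hand-picked cases.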
Attachments
Issue Links
- relates to
  - NUTCH-1564 AdaptiveFetchSchedule: sync_delta forces immediate refetch for documents not modified (Open)
  - NUTCH-1422 bypass signature comparison when a document is redirected (Closed)