|
[Excerpt from maillist, sender: Andrzej Bialecki]
When a page is redirected, the original URL is NOT updated - so CrawlDB will never know that a redirect occurred; it won't even know that a fetch occurred... This looks like a bug.
In 0.7 this was recorded in the segment, and it would then affect the Page status during updatedb. It should do so in 0.8, too...
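To make the expected behaviour concrete, here is a rough toy sketch in Python. It is not Nutch code: `FetchRecord`, `update_db` and the status strings are hypothetical names made up for illustration, and the real updatedb logic may well differ. It only shows the idea that a redirect recorded in the segment should change the status of the original URL when the db is updated.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical status labels for this toy model; these are not Nutch constants.
UNFETCHED = "unfetched"
FETCHED = "fetched"
REDIRECTED = "redirected"


@dataclass
class FetchRecord:
    """One entry a fetcher might write into a segment (toy model)."""
    url: str
    redirect_to: Optional[str] = None  # set when the fetch was redirected


def update_db(crawl_db: Dict[str, str], segment: List[FetchRecord]) -> Dict[str, str]:
    """Return a copy of crawl_db with the segment's fetch results applied."""
    updated = dict(crawl_db)
    for rec in segment:
        if rec.redirect_to is None:
            updated[rec.url] = FETCHED
        else:
            # Record the redirect against the ORIGINAL url, so the db knows
            # that a fetch happened and how it ended.
            updated[rec.url] = REDIRECTED
            # Make sure the target is known; keep its status if already present.
            updated.setdefault(rec.redirect_to, UNFETCHED)
    return updated
```

With something like this, the original URL no longer stays in its pre-fetch state after a redirect, and the redirect target is added to the db if it was not already there.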
|