Details
Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
ManifoldCF 1.2
None
Description
While dating documents is a nightmare (connectors do not all obtain their dates in the same way), it might be interesting to leverage crawling to date volatile media such as the web.
In the case of web crawling, there are three dates that can certainly be inferred from the crawler's activity:
- Date the page first appeared in the queue (loosely equivalent to a created date)
- Date the page was last checked by the crawler (this might not reflect a version update; the content could still be exactly the same)
- Date of last update (since the URL stays in the queue, the page might have changed over time, and the crawler might know about this)
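
To make the distinction between the three dates concrete, here is a minimal sketch of how a crawler could track them per URL. It is not ManifoldCF code: the `CrawlDates` class, its field names, and the `record_fetch` method are hypothetical, and detecting a change by comparing a SHA-256 hash of the fetched content is only an assumption about how "updated" might be decided.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CrawlDates:
    """Hypothetical per-URL record of the three crawl-derived dates."""
    first_queued: datetime                   # when the URL first entered the queue
    last_checked: Optional[datetime] = None  # when the crawler last fetched it
    last_updated: Optional[datetime] = None  # when the content was last seen to change
    content_hash: Optional[str] = None       # hash of the most recently seen content

    def record_fetch(self, content: bytes, now: datetime) -> bool:
        """Record one fetch and return True if the content changed."""
        digest = hashlib.sha256(content).hexdigest()
        self.last_checked = now              # every fetch counts as a check
        changed = digest != self.content_hash
        if changed:                          # only a real change moves last_updated
            self.content_hash = digest
            self.last_updated = now
        return changed


# Illustrative usage with made-up timestamps.
queued = datetime(2013, 1, 1, tzinfo=timezone.utc)
record = CrawlDates(first_queued=queued)
record.record_fetch(b"<html>v1</html>", datetime(2013, 1, 2, tzinfo=timezone.utc))
record.record_fetch(b"<html>v1</html>", datetime(2013, 1, 3, tzinfo=timezone.utc))
```

After the second fetch in this sketch, the last-checked date has advanced while the last-updated date has not, which is the case the second bullet above warns about.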