Details
Type: Improvement
Status: Closed
Priority: Trivial
Resolution: Won't Fix
0.7.2, 0.8
None
None
Description
DNS-to-IP mappings may change during long crawls, but by default JVM 1.4 caches successful lookups forever.
Some related discussion on the Jakarta-HttpClient-User list, and the relevant JVM documentation:
http://mail-archives.apache.org/mod_mbox/jakarta-httpclient-user/200506.mbox/%3c20050627022440.SVIL13442.lakermmtao05.cox.net@zeus%3e
http://java.sun.com/j2se/1.4.2/docs/guide/net/properties.html
networkaddress.cache.ttl (default: -1)
Specified in java.security to indicate the caching policy for successful name lookups from the name service. The value is specified as an integer to indicate the number of seconds to cache the successful lookup.
A value of -1 indicates "cache forever".
We probably need this code in org.apache.nutch.fetcher.Fetcher:
private static final int FETCHER_DNS_TTL_MINUTES =
    NutchConf.get().getInt("fetcher.dns.ttl.minutes", 120);

static {
    // networkaddress.cache.ttl is expressed in seconds, so convert from minutes.
    java.security.Security.setProperty("networkaddress.cache.ttl",
        Integer.toString(FETCHER_DNS_TTL_MINUTES * 60));
}

And a new property in nutch-default.xml:
<property>
  <name>fetcher.dns.ttl.minutes</name>
  <value>120</value>
  <description>DNS-to-IP cache, Time-to-Live</description>
</property>
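For anyone who wants to try the idea outside of Nutch, here is a minimal standalone sketch. The class name DnsTtlSketch and the helpers ttlSeconds and applyDnsTtl are hypothetical and not part of Nutch; the only real API used is java.security.Security. It assumes the default security setup, where setting security properties is permitted.

```java
import java.security.Security;

// Hypothetical standalone sketch; not part of Nutch.
public class DnsTtlSketch {
    // Real JVM security property that controls caching of successful lookups.
    static final String TTL_PROPERTY = "networkaddress.cache.ttl";

    /** Converts a TTL in minutes to the seconds string the property expects. */
    static String ttlSeconds(int minutes) {
        return Integer.toString(minutes * 60);
    }

    /** Sets the JVM-wide TTL for successful DNS lookups. */
    static void applyDnsTtl(int minutes) {
        Security.setProperty(TTL_PROPERTY, ttlSeconds(minutes));
    }

    public static void main(String[] args) {
        // 120 minutes mirrors the default proposed in this issue.
        applyDnsTtl(120);
        System.out.println(TTL_PROPERTY + " = " + Security.getProperty(TTL_PROPERTY));
    }
}
```

The call should probably happen before the first name lookup in the process, since the JVM is likely to read this policy only once when its address cache is initialized. That is the reason the proposal above uses a static initializer.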