Description
With http.agent.rotate set to true and an agent name list containing a single entry, the following exception is thrown:
% cat .../conf/agents.txt
my-test-crawler/Nutch-1.13
% .../bin/nutch parsechecker -Dhttp.agent.rotate=true http://nutch.apache.org/
...
Fetch failed with protocol status: exception(16), lastModified=0: java.lang.IllegalArgumentException: bound must be positive
% cat .../logs/hadoop.log
...
2017-03-03 11:17:19,750 ERROR http.Http - Failed to get protocol output
java.lang.IllegalArgumentException: bound must be positive
    at java.util.concurrent.ThreadLocalRandom.nextInt(ThreadLocalRandom.java:352)
    at org.apache.nutch.protocol.http.api.HttpBase.getUserAgent(HttpBase.java:379)
    at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:180)
    ...
Caused by
userAgentNames.get(ThreadLocalRandom.current().nextInt(userAgentNames.size()-1));
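but nextInt(...) is defined as: "Returns a pseudorandom int value between zero (inclusive) and the specified bound (exclusive)." With a one-element list, userAgentNames.size()-1 evaluates to 0, which is not a positive bound, hence the exception. For longer lists the same -1 means the last agent name is never selected.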
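The difference can be reproduced outside Nutch with a minimal sketch. The class AgentRotationSketch and its methods pickBuggy and pickFixed are hypothetical names used only for illustration; they are not part of the Nutch code base, and the corrected variant is one possible fix, not necessarily the one applied in HttpBase.

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical stand-alone class for illustration; not Nutch code.
final class AgentRotationSketch {

    // Mirrors the expression quoted in the report: the bound is size() - 1,
    // which is 0 for a one-element list and makes nextInt throw.
    static String pickBuggy(List<String> userAgentNames) {
        return userAgentNames.get(
            ThreadLocalRandom.current().nextInt(userAgentNames.size() - 1));
    }

    // Possible correction: the bound is exclusive, so pass size() itself.
    // Assumes the list is non-empty; an empty list would still throw.
    static String pickFixed(List<String> userAgentNames) {
        return userAgentNames.get(
            ThreadLocalRandom.current().nextInt(userAgentNames.size()));
    }

    public static void main(String[] args) {
        List<String> agents = List.of("my-test-crawler/Nutch-1.13");
        System.out.println("fixed variant picked: " + pickFixed(agents));
        try {
            pickBuggy(agents);
        } catch (IllegalArgumentException e) {
            System.out.println("buggy variant failed: " + e.getMessage());
        }
    }
}
```

With the exclusive bound passed as size(), every index from 0 to size()-1 is reachable, so a single-entry agents.txt works and the last entry of a longer list can be chosen as well.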