Description
For Nutch to fetch pages protected by HTTP basic authentication, the HttpClient used by the protocol-httpclient plugin must be configured with a username and password.
To set this up:
1. Add the username and password to nutch-site.xml as shown below:
<property>
  <name>http.auth.basic.username</name>
  <value>myusername</value>
  <description>
    username for http basic auth
  </description>
</property>
<property>
  <name>http.auth.basic.password</name>
  <value>mypassword</value>
  <description>
    password for http basic auth
  </description>
</property>
2. Apply the attached patch to nutch/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/Http.java, which configures HttpClient with these credentials.
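
For context, here is a minimal sketch of what the two configured values amount to on the wire. This is not the attached patch and does not use Nutch or HttpClient classes. It only illustrates the standard HTTP Basic scheme, where the username and password are joined with a colon, Base64-encoded, and sent in the Authorization request header. The class name BasicAuthHeaderSketch and the method basicAuthHeader are hypothetical names chosen for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical illustration class; not part of Nutch or of the attached patch.
public class BasicAuthHeaderSketch {

    // Builds the value of the Authorization header for HTTP Basic auth:
    // "Basic " followed by Base64("username:password").
    static String basicAuthHeader(String username, String password) {
        String pair = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // The values mirror the sample nutch-site.xml properties
        // http.auth.basic.username and http.auth.basic.password.
        String header = basicAuthHeader("myusername", "mypassword");
        System.out.println("Authorization: " + header);
    }
}
```

Because Base64 is an encoding and not encryption, anyone who can read nutch-site.xml or observe unencrypted traffic can recover the password, so it may be worth restricting access to the config file and fetching over HTTPS where possible.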