Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 1.5
    • Component/s: None
    • Labels:
      None

      Description

      Nutch does not send an HTTP Accept header with its requests. This is usually not a problem, but some firewalls do not like it and will reject the request.

        Activity

        Markus Jelsma added a comment -

        Any idea on how to resolve this? Suggestions for code location and header value?

        Julien Nioche added a comment -

        code location - same as

        <property>
        <name>http.accept.language</name>
        <value>en-us,en-gb,en;q=0.7,*;q=0.3</value>
        <description>Value of the "Accept-Language" request header field.
        This allows selecting non-English language as default one to retrieve.
        It is a useful setting for search engines build for certain national group.
        </description>
        </property>

        ?
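
A minimal sketch of what "same as `http.accept.language`" could look like: read the header value from configuration and fall back to a default. This is an illustration only, not the committed patch. `java.util.Properties` stands in for Nutch's configuration object, the property name `http.accept` is an assumption (the thread does not name it), and the default Accept value is the one reported in the test output later in this issue.

```java
import java.util.Properties;

public class AcceptConfigSketch {

    // Default taken from the http.accept.language property quoted above.
    static final String DEFAULT_ACCEPT_LANGUAGE = "en-us,en-gb,en;q=0.7,*;q=0.3";

    // Value reported in the PHP test output later in this issue.
    static final String DEFAULT_ACCEPT =
        "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";

    /** Existing setting: value of the Accept-Language request header. */
    static String acceptLanguage(Properties conf) {
        return conf.getProperty("http.accept.language", DEFAULT_ACCEPT_LANGUAGE);
    }

    /** Hypothetical setting: "http.accept" is an assumed property name. */
    static String accept(Properties conf) {
        return conf.getProperty("http.accept", DEFAULT_ACCEPT);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Nothing is set, so both lookups fall back to their defaults.
        System.out.println("Accept-Language: " + acceptLanguage(conf));
        System.out.println("Accept: " + accept(conf));

        // A site-specific override replaces the default.
        conf.setProperty("http.accept", "text/html;q=1.0,*/*;q=0.1");
        System.out.println("Accept: " + accept(conf));
    }
}
```

Keeping the value in configuration rather than hard-coding it lets a crawl operator adjust the header for a picky server without rebuilding.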

        Markus Jelsma added a comment -

        Ah, yes, that should work out just fine. Thanks for pointing me to it!

        Markus Jelsma added a comment -

        Patch for 1.5. A simple PHP script tells me it works as the Accept header is sent along with the rest:

        ["HTTP_ACCEPT_LANGUAGE"]=> string(28) "en-us,en-gb,en;q=0.7,*;q=0.3"
        ["HTTP_ACCEPT"]=> string(63) "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
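
For readers reproducing this check, here is a rough sketch of how the two headers end up under those keys. It builds the header section of a request by hand and derives the CGI-style names that PHP exposes in `$_SERVER` (upper-case, `-` replaced by `_`, prefixed with `HTTP_`). The class and helper names are hypothetical and this is not the code from the patch; the host `example.org` is a placeholder.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class AcceptHeaderSketch {

    // Header values copied from the PHP output quoted above.
    static final String ACCEPT_LANGUAGE = "en-us,en-gb,en;q=0.7,*;q=0.3";
    static final String ACCEPT =
        "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";

    /** Builds the header section of a GET request (hypothetical helper). */
    static String buildRequest(String host, String path, Map<String, String> headers) {
        StringBuilder sb = new StringBuilder();
        sb.append("GET ").append(path).append(" HTTP/1.0\r\n");
        sb.append("Host: ").append(host).append("\r\n");
        for (Map.Entry<String, String> h : headers.entrySet()) {
            sb.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
        }
        sb.append("\r\n"); // an empty line ends the header section
        return sb.toString();
    }

    /** CGI-style variable name, e.g. "Accept-Language" becomes "HTTP_ACCEPT_LANGUAGE". */
    static String cgiName(String headerName) {
        return "HTTP_" + headerName.toUpperCase(Locale.ROOT).replace('-', '_');
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Accept-Language", ACCEPT_LANGUAGE);
        headers.put("Accept", ACCEPT);

        System.out.print(buildRequest("example.org", "/", headers));

        // Mirrors the shape of the var_dump output: key, then string length.
        for (Map.Entry<String, String> h : headers.entrySet()) {
            System.out.println(cgiName(h.getKey()) + " => string("
                + h.getValue().length() + ")");
        }
    }
}
```

The printed lengths should match the `string(28)` and `string(63)` figures in the output above, which is a quick way to confirm the values arrived unmodified.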
        
        Lewis John McGibbney added a comment -

        Looks good to me Markus. +1

        Markus Jelsma added a comment -

        Committed for 1.5 in rev. 1301480.
        Thanks Lewis.

        Hudson added a comment -

        Integrated in nutch-trunk-maven #198 (See https://builds.apache.org/job/nutch-trunk-maven/198/)
        NUTCH-1310 Nutch to send HTTP-accept header (Revision 1301480)

        Result = SUCCESS
        markus :
        Files :

        • /nutch/trunk/CHANGES.txt
        • /nutch/trunk/conf/nutch-default.xml
        • /nutch/trunk/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java
        • /nutch/trunk/src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http/HttpResponse.java
        Hudson added a comment -

        Integrated in Nutch-trunk #1789 (See https://builds.apache.org/job/Nutch-trunk/1789/)
        NUTCH-1310 Nutch to send HTTP-accept header (Revision 1301480)

        Result = SUCCESS
        markus : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301480
        Files :

        • /nutch/trunk/CHANGES.txt
        • /nutch/trunk/conf/nutch-default.xml
        • /nutch/trunk/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java
        • /nutch/trunk/src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http/HttpResponse.java

          People

          • Assignee:
            Markus Jelsma
            Reporter:
            Markus Jelsma