  1. Nutch
  2. NUTCH-2280

HTTP Post form authentication CookiePolicy configuration

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.11
    • Fix Version/s: None
    • Component/s: protocol
    • Labels:
    • External issue ID: 827

      Description

      The protocol-httpclient plugin supports HTTP form authentication: it posts the form values to the configured login URL and stores the session cookie for subsequent content retrieval.
      The httpclient default CookiePolicy setting is currently used. This default rejects any cookie whose domain starts with ".", for example domain=".domain.com", even though most web browsers accept this kind of domain value.
      I suggest adding a configurable option to conf/httpclient-auth.xml:

      <credentials authMethod="formMethod" ...>
      ...
        <loginCookie>
          <policy>DEFAULT | BROWSER_COMPATIBILITY | NETSCAPE | RFC_2109 | RFC_2965</policy>
        </loginCookie>
      </credentials>

      The httpclient would then use the configured cookie policy value.
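
      As a rough sketch of how such an option might be read, the snippet below parses a credentials fragment with the JDK's DOM parser and validates the policy name against the five values listed above. The class name LoginCookiePolicy, the method readPolicy, and the fallback to DEFAULT when no <policy> element is present are assumptions made for illustration; they are not taken from the Nutch code base or from any patch.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Nutch: reads <loginCookie><policy> from a
// <credentials> fragment and checks it against the proposed policy names.
final class LoginCookiePolicy {

  // The policy names proposed in this issue.
  static final List<String> ALLOWED = Arrays.asList(
      "DEFAULT", "BROWSER_COMPATIBILITY", "NETSCAPE", "RFC_2109", "RFC_2965");

  static String readPolicy(String xml) {
    Document doc;
    try {
      doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalArgumentException("Cannot parse credentials XML", e);
    }
    // Assumption: a missing <loginCookie> or <policy> element means DEFAULT.
    NodeList cookies = doc.getElementsByTagName("loginCookie");
    if (cookies.getLength() == 0) {
      return "DEFAULT";
    }
    NodeList policies = ((Element) cookies.item(0)).getElementsByTagName("policy");
    if (policies.getLength() == 0) {
      return "DEFAULT";
    }
    String value = policies.item(0).getTextContent().trim();
    if (!ALLOWED.contains(value)) {
      throw new IllegalArgumentException("Unknown cookie policy: " + value);
    }
    return value;
  }

  public static void main(String[] args) {
    String xml = "<credentials authMethod=\"formMethod\">"
        + "<loginCookie><policy>BROWSER_COMPATIBILITY</policy></loginCookie>"
        + "</credentials>";
    System.out.println(readPolicy(xml)); // prints BROWSER_COMPATIBILITY
  }
}
```

      The validated name would still have to be translated to the matching CookiePolicy constant of Commons HttpClient and applied to the client parameters; that step is left out here because it depends on how the plugin builds its HttpClient instance.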

      I am working on a patch for this feature, but before I implement the configuration format change, I would like to hear any other suggestions or comments.

              People

              • Assignee: Lewis John McGibbney (lewismc)
              • Reporter: Steve Yao (stevegy)
              • Votes: 0
              • Watchers: 3

                  Time Tracking

                  Estimated: 168h
                  Remaining: 168h
                  Logged: Not Specified