Uploaded image for project: 'Nutch'
  1. Nutch
  2. NUTCH-876

Remove remaining robots/IP blocking code in lib-http

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • nutchgora
    • nutchgora
    • fetcher
    • None

    Description

      There are remains of the (very old) blocking code in lib-http/.../HttpBase.java. This code was used with the OldFetcher to manage politeness limits. New trunk doesn't have OldFetcher anymore, so this code is useless. Furthermore, there is an actual bug here - FetcherJob forgets to set Protocol.CHECK_BLOCKING and Protocol.CHECK_ROBOTS to false, and the defaults in lib-http are set to true.

      Attachments

        1. NUTCH-876.patch
          10 kB
          Andrzej Bialecki

        Activity

          People

            ab Andrzej Bialecki
            ab Andrzej Bialecki
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: