Uploaded image for project: 'Nutch'
  1. Nutch
  2. NUTCH-2930

Protocol-okhttp: implement IP filter

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Closed
    • Major
    • Resolution: Implemented
    • 1.18
    • 1.19
    • plugin, protocol, security
    • None
    • Patch Available

    Description

      In order to avoid information leakage to a public search index or web archive, it should be possible to configure Nutch in a way that no content is fetched from localhost, loop-back addresses, private address spaces.

      NUTCH-2527 adds the configuration snippets to exclude URLs pointing to private addresses.

      However, filtering URLs isn't enough because a DNS entry of an arbitrary host name may point to a private IP address. Blocking must happen on the protocol level because the IP address is only know in the protocol implementation. I'll add an implementation for protocol-okhttp.

      Attachments

        Issue Links

          Activity

            People

              snagel Sebastian Nagel
              snagel Sebastian Nagel
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: