Nutch / NUTCH-2310

Protocol-Selenium does not support HTTPS protocol

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.12
    • Fix Version/s: 1.15
    • Component/s: protocol

    Description

      The protocol-selenium and protocol-interactiveselenium plugins raise an error whenever they are asked to fetch an HTTPS URL.

      The source code of both plugins shows that HTTP is the only scheme they currently accept, so Nutch cannot crawl JavaScript-driven HTTPS sites through Selenium WebDriver.
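
      To illustrate the kind of check involved, here is a minimal sketch of a scheme guard that accepts both HTTP and HTTPS. It is not the plugins' actual code: the class name SchemeCheck, the SUPPORTED set, and the isSupported method are hypothetical names chosen for this example, and only java.net.URI from the standard library is used.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Nutch: shows a scheme allow-list
// that admits "https" in addition to "http".
final class SchemeCheck {

    // Assumption: an HTTP-only check would correspond to Set.of("http").
    private static final Set<String> SUPPORTED = Set.of("http", "https");

    // Returns true when the URI has a scheme and that scheme is allowed.
    // Schemes are compared case-insensitively, as URI schemes are.
    public static boolean isSupported(URI uri) {
        String scheme = uri.getScheme();
        return scheme != null
                && SUPPORTED.contains(scheme.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(isSupported(URI.create("http://example.com/")));  // true
        System.out.println(isSupported(URI.create("https://example.com/"))); // true
        System.out.println(isSupported(URI.create("ftp://example.com/")));   // false
    }
}
```

      With an allow-list like this, widening protocol support is a one-line change to the set rather than an edit to every comparison against the literal "http".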

            People

              Assignee: Unassigned
              Reporter: Joey Hong (jxihong)
              Votes: 2
              Watchers: 5

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: 48h
                  Remaining: 48h
                  Logged: Not Specified