Nutch / NUTCH-1408

RobotRulesParser main doesn't take URLs

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.5
    • Fix Version/s: 1.6
    • Component/s: None
    • Labels: None
    • Patch Info: Patch Available

    Description

      The main() method of lib-http's org.apache.nutch.protocol.http.api.RobotRulesParser takes a robot file and a URL file, according to its usage output. However, it expects URI paths rather than full URLs, so it will never work if the input contains URLs.
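
      To illustrate the mismatch, here is a minimal sketch of how a line from the URL file could be reduced to the path that robots rules are matched against. It is not the attached patch and does not use any Nutch API; the class name UrlToRobotsPath and the method toRobotsPath are hypothetical, and treating unparsable or scheme-less input as an already-valid path is an assumption.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UrlToRobotsPath {

    // Hypothetical helper (not from the Nutch patch): reduce a full URL to
    // the path-plus-query part that robots.txt rules are compared against.
    public static String toRobotsPath(String line) {
        try {
            URI uri = new URI(line.trim());
            String path = uri.getRawPath();
            if (path == null) {
                // Opaque URI such as "mailto:..."; assumption: pass it through.
                return line;
            }
            if (path.isEmpty()) {
                // "http://example.com" has no path; robots rules treat it as "/".
                path = "/";
            }
            String query = uri.getRawQuery();
            return query == null ? path : path + "?" + query;
        } catch (URISyntaxException e) {
            // Assumption: input that cannot be parsed is left unchanged.
            return line;
        }
    }

    public static void main(String[] args) {
        System.out.println(toRobotsPath("http://example.com/private/page.html?x=1"));
        System.out.println(toRobotsPath("http://example.com"));
        System.out.println(toRobotsPath("/private/page.html"));
    }
}
```

      With a conversion of this kind, an input file may contain either full URLs or bare paths, and both reach the rule check in the same form.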

      Attachments

        1. NUTCH-1408-1.6-1.patch
          0.9 kB
          Markus Jelsma

          People

            Assignee: Markus Jelsma (markus17)
            Reporter: Markus Jelsma (markus17)
            Votes: 0
            Watchers: 3
