  1. Nutch
  2. NUTCH-1408

RobotRulesParser main() doesn't accept URLs

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.5
    • Fix Version/s: 1.6
    • Component/s: None
    • Labels:
      None
    • Patch Info:
      Patch Available

      Description

      lib-http's org.apache.nutch.protocol.http.api.RobotRulesParser main() takes a robots file and a URL file, according to its usage output. However, it expects URI paths rather than full URLs, so it never works when the input file contains URLs.
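
      The mismatch can be illustrated with a minimal sketch of one way to reduce a full URL to the path that robots rules are matched against. This is not the attached patch and does not use Nutch's API: the class name UrlPathSketch and the method toPath are hypothetical, and the sketch assumes that inputs which are already paths should pass through unchanged.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper for illustration only; not part of Nutch.
class UrlPathSketch {

    /**
     * Returns the path (plus query string, if present) of the given input.
     * Inputs that cannot be parsed, or that have no path component,
     * are returned unchanged (assumption: they are already usable as-is).
     */
    public static String toPath(String input) {
        try {
            URI uri = new URI(input.trim());
            String path = uri.getRawPath();
            if (path == null) {
                return input; // opaque URI such as "mailto:...": nothing to extract
            }
            if (path.isEmpty()) {
                path = "/"; // "http://example.com" refers to the root path
            }
            String query = uri.getRawQuery();
            return query == null ? path : path + "?" + query;
        } catch (URISyntaxException e) {
            return input; // not a valid URI: leave it untouched
        }
    }

    public static void main(String[] args) {
        // A full URL is reduced to its path.
        System.out.println(toPath("http://example.com/private/page.html"));
        // A bare path is passed through unchanged.
        System.out.println(toPath("/private/page.html"));
    }
}
```

      With a conversion like this applied to each line of the URL file, both full URLs and bare paths would yield a value that can be compared against the Allow and Disallow rules.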

        Attachments

        1. NUTCH-1408-1.6-1.patch
          0.9 kB
          Markus Jelsma

            People

            • Assignee:
              markus17 Markus Jelsma
              Reporter:
              markus17 Markus Jelsma
            • Votes:
              0
              Watchers:
              3
