Nutch / NUTCH-1090

LinkDb (invertlinks) should inform the user when it ignores internal links

    Details

    • Patch Info: Patch Available

      Description

      I used Nutch to crawl sites on a single domain. After the crawl completed, I tried to build a LinkDb, but the LinkDb was empty.
      It turns out that this happens because the invertlinks command ignores internal links (links within the same domain) by default.

      Unfortunately, the LinkDb class does not report this, so it was hard to find out why the LinkDb was empty.

      I suggest adding a message that informs the user when the invertlinks command is ignoring internal links.
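
To illustrate the kind of notice meant here, the following is a minimal, self-contained sketch and not the attached patch. The class InternalLinkNotice and its methods are hypothetical names, "internal" is assumed to mean "same host", and java.util.logging stands in for whatever logger LinkDb actually uses. In Nutch itself the behavior is, as far as I can tell, controlled by the db.ignore.internal.links property.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical stand-in for the relevant part of LinkDb; not Nutch code.
public class InternalLinkNotice {
    private static final Logger LOG =
        Logger.getLogger(InternalLinkNotice.class.getName());

    // Assumption: a link is "internal" when source and target share a host.
    public static boolean isInternal(String fromUrl, String toUrl) {
        String fromHost = URI.create(fromUrl).getHost();
        String toHost = URI.create(toUrl).getHost();
        return fromHost != null && fromHost.equalsIgnoreCase(toHost);
    }

    // Message shown once per run; empty when no links are being ignored.
    public static String startupNotice(boolean ignoreInternalLinks) {
        return ignoreInternalLinks
            ? "LinkDb: internal links will be ignored."
            : "";
    }

    // Each link is a {from, to} pair. Logs the notice, then drops
    // internal links only when ignoreInternalLinks is set.
    public static List<String[]> keepLinks(List<String[]> links,
                                           boolean ignoreInternalLinks) {
        String notice = startupNotice(ignoreInternalLinks);
        if (!notice.isEmpty()) {
            LOG.info(notice);
        }
        List<String[]> kept = new ArrayList<>();
        for (String[] link : links) {
            if (ignoreInternalLinks && isInternal(link[0], link[1])) {
                continue; // silently dropped today; the notice explains why
            }
            kept.add(link);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String[]> links = new ArrayList<>();
        links.add(new String[] {"http://a.example/1", "http://a.example/2"});
        links.add(new String[] {"http://a.example/1", "http://b.example/"});
        System.out.println(keepLinks(links, true).size() + " link(s) kept");
    }
}
```

With a single-domain crawl every link is internal, so the result is empty; the logged notice is what would have pointed me to the cause.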

        Attachments

        1. LinkDb.patch
          2 kB
          Marek Bachmann

            People

            • Assignee: Markus Jelsma (markus17)
            • Reporter: Marek Bachmann (telekoma)
