Nutch / NUTCH-2155

Create a "crawl completeness" utility


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.10
    • Fix Version/s: 1.11
    • Component/s: util

    Description

      I've found it useful to have a tool that dumps "completeness" information for a crawl, similar to what domainstats does, but including fetched and unfetched counts per domain/host. This is especially helpful for vertical crawls over a few domains, or simply for seeing how much of a host/domain the crawl has covered so far.
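
      To illustrate the kind of per-host report described above, here is a minimal sketch. It is not the code from the attached patch and does not read a real CrawlDb: the input is a hypothetical list of (url, status) pairs, and the function name `completeness_by_host` is made up for this example. The status strings `db_fetched` and `db_unfetched` follow Nutch's CrawlDb status names.

```python
from collections import Counter, defaultdict
from urllib.parse import urlsplit


def completeness_by_host(records):
    """Count fetched and unfetched URLs per host.

    `records` is a hypothetical iterable of (url, status) pairs; a real
    tool would read these from the CrawlDb instead.
    Returns {host: (fetched_count, unfetched_count)}.
    """
    counts = defaultdict(Counter)
    for url, status in records:
        host = urlsplit(url).hostname or "(no host)"
        # Only the two statuses of interest are counted; others are ignored.
        if status in ("db_fetched", "db_unfetched"):
            counts[host][status] += 1
    return {
        host: (c["db_fetched"], c["db_unfetched"])
        for host, c in counts.items()
    }


# Made-up sample data, for illustration only.
records = [
    ("http://example.org/", "db_fetched"),
    ("http://example.org/a", "db_fetched"),
    ("http://example.org/b", "db_unfetched"),
    ("http://news.example.com/x", "db_unfetched"),
    ("http://news.example.com/y", "db_gone"),
]

stats = completeness_by_host(records)
for host, (fetched, unfetched) in sorted(stats.items()):
    print(f"{host}\tfetched={fetched}\tunfetched={unfetched}")
```

      Grouping by registered domain instead of host would only change how the key is derived from the URL; the counting stays the same.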

      Attachments

        1. NUTCH-2155_joyce_9Nov2015.patch
          2 kB
          Michael Joyce

          People

            Assignee: mjoyce Michael Joyce
            Reporter: mjoyce Michael Joyce
            Votes: 0
            Watchers: 7