Nutch / NUTCH-2046

The crawl script should be able to skip an initial injection.

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.10
    • Fix Version/s: 1.14
    • Component/s: crawldb, injector
    • Labels:

      Description

      When a crawl gets very large, a new injection takes considerable time because it updates the crawldb. The crawl script should be able to skip the injection and go directly to the generate step.
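
The requested behaviour can be sketched roughly as follows. This is a minimal illustration only, not the attached crawl.patch: the function name `plan_first_steps` and the convention that an empty seed directory means "skip injection" are assumptions made for this example. It prints the commands instead of running them, so it can be tried without a Nutch installation.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: decide which first steps of a crawl to run.
# Assumption: an empty or missing seed directory means "skip injection".

# plan_first_steps CRAWL_DIR [SEED_DIR]
plan_first_steps() {
  local crawl_dir="$1"
  local seed_dir="${2:-}"   # empty when no seed directory is given

  if [ -n "$seed_dir" ]; then
    # Injection updates the crawldb, which is the slow part on a large crawl.
    echo "bin/nutch inject $crawl_dir/crawldb $seed_dir"
  fi

  # Generation always runs against the existing crawldb.
  echo "bin/nutch generate $crawl_dir/crawldb $crawl_dir/segments"
}

plan_first_steps crawl urls   # prints the inject and the generate command
plan_first_steps crawl        # prints only the generate command
```

With a seed directory the plan starts with `inject`; without one it goes straight to `generate`, which is the shortcut the description asks for.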

        Attachments

        1. crawl.patch
          1 kB
          Luis Lopez

              People

              • Assignee:
                jnioche Julien Nioche
                Reporter:
                betolink Luis Lopez
              • Votes:
                0
                Watchers:
                7

                Dates

                • Created:
                  Updated:
                  Resolved: