Nutch / NUTCH-2046

The crawl script should be able to skip an initial injection.


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.10
    • Fix Version/s: 1.14
    • Component/s: crawldb, injector

    Description

      When a crawl gets really big, a new injection takes considerable time because it updates the crawldb. The crawl script should be able to skip the injection and go directly to the generate call.
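
The requested behaviour could be sketched roughly as follows. This is a minimal illustration, not the contents of crawl.patch: the function name plan_crawl and the idea of treating the seed directory as an optional second argument are assumptions made for the example, and the steps are only echoed rather than run against a real crawldb.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (not the attached patch).
# plan_crawl prints the first steps a crawl round would run, in order.
# Assumption: the seed directory is optional; when it is missing or empty,
# the inject step is skipped and the round starts at generate.
plan_crawl() {
  local crawl_dir="$1"
  local seed_dir="${2:-}"

  if [ -n "$seed_dir" ]; then
    # A seed directory was given, so the crawldb is updated first.
    echo "inject $crawl_dir/crawldb $seed_dir"
  else
    # No seed directory: leave the existing crawldb untouched.
    echo "skip inject (no seed dir given)"
  fi

  # Either way, the round continues with the generate call.
  echo "generate $crawl_dir/crawldb $crawl_dir/segments"
}

plan_crawl crawl urls   # first run: inject, then generate
plan_crawl crawl        # later runs: go directly to generate
```

Making the seed directory optional keeps the first crawl unchanged while letting later runs on a large crawldb avoid the costly injection.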

      Attachments

        1. crawl.patch
          1 kB
          Luis Lopez

            People

              Assignee: Julien Nioche (jnioche)
              Reporter: Luis Lopez (betolink)
              Votes: 0
              Watchers: 7
