Nutch / NUTCH-2442

Injector to stop if job fails to avoid loss of CrawlDb

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.13
    • Fix Version/s: 1.14
    • Component/s: injector
    • Labels: None
    • Patch Info: Patch Available

    Description

      Injector does not check whether its MapReduce job succeeded. Even if the job fails, Injector

      • installs the CrawlDb anyway:
        • moves current/ to old/
        • replaces current/ with an empty or potentially incomplete version
      • exits with code 0, so scripts running the crawl workflow cannot detect the failure

      If Injector is then run a second time, the CrawlDb is lost: both current/ and old/ are empty or corrupted.
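      The guard this issue asks for can be sketched as follows. This is a minimal illustration using only the Java standard library on a local filesystem, not the actual Nutch patch: the class `CrawlDbInstallSketch` and the methods `installIfSuccessful`, `runInject`, and `deleteRecursively` are hypothetical names, and the `jobSucceeded` flag stands in for the job result (in Hadoop, `Job.waitForCompletion(boolean)` returns false when the job fails).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CrawlDbInstallSketch {

  /** Hypothetical helper: rotate current/ to old/ only when the job succeeded. */
  public static void installIfSuccessful(boolean jobSucceeded, Path crawlDb, Path tempOutput)
      throws IOException {
    if (!jobSucceeded) {
      // Discard the incomplete output; current/ and old/ stay untouched.
      deleteRecursively(tempOutput);
      throw new IOException("Injector job failed, CrawlDb left unchanged: " + crawlDb);
    }
    Path current = crawlDb.resolve("current");
    Path old = crawlDb.resolve("old");
    deleteRecursively(old);
    if (Files.exists(current)) {
      Files.move(current, old);
    }
    Files.move(tempOutput, current);
  }

  /** Hypothetical wrapper: map failure to a non-zero exit code for calling scripts. */
  public static int runInject(boolean jobSucceeded, Path crawlDb, Path tempOutput) {
    try {
      installIfSuccessful(jobSucceeded, crawlDb, tempOutput);
      return 0;
    } catch (IOException e) {
      System.err.println(e.getMessage());
      return 1;
    }
  }

  static void deleteRecursively(Path root) throws IOException {
    if (!Files.exists(root)) {
      return;
    }
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(root)) {
      // Reverse order so children are deleted before their parent directory.
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : paths) {
      Files.delete(p);
    }
  }

  public static void main(String[] args) throws IOException {
    Path base = Files.createTempDirectory("crawldb-sketch");
    Path crawlDb = base.resolve("crawldb");
    Files.createDirectories(crawlDb.resolve("current"));
    Files.write(crawlDb.resolve("current").resolve("part-00000"),
        "existing data".getBytes(StandardCharsets.UTF_8));

    // Simulate a failed job that left an empty output directory behind.
    Path tempOutput = base.resolve("inject-temp");
    Files.createDirectories(tempOutput);

    int exitCode = runInject(false, crawlDb, tempOutput);
    System.out.println("exit code: " + exitCode);
    System.out.println("current/ intact: "
        + Files.exists(crawlDb.resolve("current").resolve("part-00000")));
  }
}
```

      A real command-line tool would pass the value returned by `runInject` to `System.exit`, so that a crawl script can stop when injection fails. Because a failed run never rotates the directories, repeating it cannot overwrite old/ with bad data.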

            People

              Assignee: Unassigned
              Reporter: Sebastian Nagel (snagel)
              Votes: 0
              Watchers: 5
