Nutch / NUTCH-2524

bin/crawl: fix check for HostDb in distributed mode

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.15
    • Fix Version/s: 1.15
    • Component/s: bin
    • Labels:
      None

      Description

      In the crawl script you can find something like:
      if [[ -d "$CRAWL_PATH"/hostdb ]]; then
      echo "Processing sitemaps based on hosts in HostDB"
      __bin_nutch sitemap "$CRAWL_PATH"/crawldb -hostdb "$CRAWL_PATH"/hostdb -threads $NUM_THREADS
      fi

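      The check if [[ -d "$CRAWL_PATH"/hostdb ]]; does not work for HDFS, only for local mode: the -d test inspects the local filesystem, so in distributed mode it never sees a HostDb stored on HDFS.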
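
      One possible way to make the check work in both modes is sketched below. This is only an illustration, not necessarily the change committed for this issue. The helper name directory_exists and its mode argument ("local" or anything else for distributed) are assumptions made for this example; hadoop fs -test -d is the standard Hadoop shell test for an existing directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and arguments are assumptions for this sketch).
# Usage: directory_exists MODE PATH
#   MODE "local"  -> test the path on the local filesystem
#   anything else -> ask Hadoop, so the path is resolved on HDFS
directory_exists() {
  local mode="$1" dir="$2"
  if [ "$mode" = local ]; then
    [ -d "$dir" ]
  else
    # exits with status 0 only if the path exists and is a directory
    hadoop fs -test -d "$dir"
  fi
}

# How the sitemap step from the description might then be guarded
# (kept as a comment because it needs the rest of the crawl script):
#   if directory_exists "$mode" "$CRAWL_PATH"/hostdb; then
#     echo "Processing sitemaps based on hosts in HostDB"
#     __bin_nutch sitemap "$CRAWL_PATH"/crawldb -hostdb "$CRAWL_PATH"/hostdb -threads $NUM_THREADS
#   fi
```

      The helper returns the exit status of the underlying test, so it can be used directly as the condition of an if statement in place of [[ -d ... ]].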

              People

              • Assignee:
                Unassigned
              • Reporter:
                semyon.semyonov@mail.com Semyon Semyonov
              • Votes:
                0
              • Watchers:
                3

                Dates

                • Created:
                  Updated:
                  Resolved: