NUTCH-252

Launching a segread/readdb command kills any running nutch commands

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 0.8
    • Fix Version/s: 0.8
    • Component/s: None
    • Labels: None
    • Environment: multi-box installation using DFS (1 jobtracker/namenode master, 10 tasktracker/datanode slaves)

    Description

      I use a simple script to conduct a whole-web crawl (generate, fetch, updatedb, repeating until the target depth is reached; a sketch follows this description). While it runs, I monitor progress via the jobtracker's browser-based UI. Sometimes there is a fairly long pause between one MapReduce job completing and the next one launching, so I mistakenly assume that the target depth has been reached. I then launch a segread -list or readdb -stats command to summarize the results. Doing so apparently kills any active jobs, with no warning in the logs, the console output, or the jobtracker's UI: the jobs simply stop writing to the logs, and any child processes disappear. Usually the jobtracker and tasktrackers remain up and respond to subsequent commands.
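
      For context, here is a minimal sketch of the kind of crawl loop described above, written against the Nutch 0.8 command-line tools. The paths, depth, and segment-selection logic are illustrative assumptions, not the reporter's actual script:

        #!/bin/sh
        # Minimal sketch of a whole-web crawl loop (generate, fetch,
        # updatedb, repeat). NUTCH, CRAWLDB, SEGMENTS, and DEPTH are
        # assumed values; adjust to your installation.
        NUTCH=bin/nutch
        CRAWLDB=crawl/crawldb
        SEGMENTS=crawl/segments
        DEPTH=5

        i=1
        while [ $i -le $DEPTH ]; do
          $NUTCH generate $CRAWLDB $SEGMENTS
          # Segment directories are named by timestamp, so the newest
          # one sorts last.
          SEGMENT=$SEGMENTS/`ls $SEGMENTS | sort | tail -1`
          $NUTCH fetch $SEGMENT
          $NUTCH updatedb $CRAWLDB $SEGMENT
          i=`expr $i + 1`
        done

        # The status checks that triggered the problem were run while
        # the loop above was still active, e.g.:
        #   $NUTCH segread -list $SEGMENT
        #   $NUTCH readdb $CRAWLDB -stats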

    Activity

      Andrzej Bialecki added a comment -

      This code has been moved to Hadoop (and fixed there).


    People

      • Assignee: Unassigned
      • Reporter: Chris Schneider
      • Votes: 0
      • Watchers: 1
