Nutch / NUTCH-258

Once Nutch logs a SEVERE log item, Nutch fails forevermore

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Cannot Reproduce
    • Affects Version/s: 0.8
    • Fix Version/s: 0.9.0
    • Component/s: fetcher
    • Labels: None
    • Environment: All

    Description

      Once a SEVERE log item is written, Nutch shuts down any fetching forevermore. This is from the run() method in Fetcher.java:

      public void run() {
        synchronized (Fetcher.this) {activeThreads++;} // count threads

        try {
          UTF8 key = new UTF8();
          CrawlDatum datum = new CrawlDatum();

          while (true) {
            if (LogFormatter.hasLoggedSevere()) // something bad happened
              break; // exit

      Notice the last two lines. Once this check is hit, Nutch will never fetch again in that JVM, because LogFormatter stores the "has logged severe" state in a static field that is never reset.

      (Note that "LogFormatter.hasLoggedSevere()" is also checked in org.apache.nutch.net.URLFilterChecker, so that class will be disabled as well.)
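
      To make the failure mode concrete, here is a minimal, self-contained sketch of a JVM-wide flag that latches on the first SEVERE message. StickyLog and StickyFetchDemo are hypothetical stand-ins written for illustration; they are not the actual LogFormatter or Fetcher code, and only the shape of the hasLoggedSevere() check is taken from the excerpt above.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a log helper that remembers, JVM-wide,
// that a SEVERE message was written. Not the real Nutch LogFormatter.
class StickyLog {
    // Static state: shared by every caller and never reset.
    private static final AtomicBoolean loggedSevere = new AtomicBoolean(false);

    static void severe(String message) {
        loggedSevere.set(true);
        System.err.println("SEVERE: " + message);
    }

    static boolean hasLoggedSevere() {
        return loggedSevere.get();
    }
}

// Hypothetical fetch loop that uses the same kind of check as the excerpt.
class StickyFetchDemo {
    // Simulates one fetch run and returns how many items were fetched.
    // An item whose index equals failAt logs a SEVERE message instead.
    static int fetchRun(int items, int failAt) {
        int fetched = 0;
        for (int i = 0; i < items; i++) {
            if (StickyLog.hasLoggedSevere()) { // something bad happened
                break;                         // exit
            }
            if (i == failAt) {
                StickyLog.severe("simulated failure at item " + i);
                continue;
            }
            fetched++;
        }
        return fetched;
    }

    public static void main(String[] args) {
        // First run stops shortly after the simulated failure.
        System.out.println("first run fetched:  " + fetchRun(5, 2));
        // Second run has no failure of its own, yet fetches nothing,
        // because the static flag is still set from the first run.
        System.out.println("second run fetched: " + fetchRun(5, -1));
    }
}
```

      In this sketch the second run fetches nothing even though nothing went wrong during it, which is the behaviour described above for a long-running process.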

      This must be fixed, or Nutch cannot be run as any kind of long-running service. Furthermore, I believe it is a poor decision to rely on a logging event to determine the state of the application: this could have any number of side effects that would be extremely difficult to track down (as it already has for me).
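
      One possible direction, sketched here under assumptions and not taken from any of the attached patches, is to keep the "abort" state on the object that represents a single fetch run, so a failure in one run cannot leak into the next. FetchRun, abort() and isAborted() are hypothetical names invented for this illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical per-run state holder; not part of Nutch.
class FetchRun {
    // Instance state: each run starts with a fresh, unset flag.
    private final AtomicBoolean aborted = new AtomicBoolean(false);

    // Called explicitly by the code that detects a fatal condition,
    // instead of inferring the condition from the logging system.
    void abort(String reason) {
        aborted.set(true);
        System.err.println("aborting run: " + reason);
    }

    boolean isAborted() {
        return aborted.get();
    }

    // Simulates fetching; an item whose index equals failAt aborts the run.
    // Returns how many items were fetched before the run stopped.
    int fetch(int items, int failAt) {
        int fetched = 0;
        for (int i = 0; i < items && !isAborted(); i++) {
            if (i == failAt) {
                abort("simulated failure at item " + i);
            } else {
                fetched++;
            }
        }
        return fetched;
    }
}

class FetchRunDemo {
    public static void main(String[] args) {
        // The first run aborts partway through.
        System.out.println("first run fetched:  " + new FetchRun().fetch(5, 2));
        // A new run is unaffected by the earlier failure.
        System.out.println("second run fetched: " + new FetchRun().fetch(5, -1));
    }
}
```

      The point of the sketch is only that application state is set by an explicit call and scoped to one run; how that would map onto the real Fetcher threads is left open.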

      Attachments

        1. dumbfix.patch
          0.7 kB
          Stefan Neufeind
        2. NUTCH-258.Mattmann.060906.patch.txt
          20 kB
          Chris A. Mattmann
        3. NUTCH-258.Mattmann.080406.patch.txt
          9 kB
          Chris A. Mattmann

            People

              Assignee: Chris A. Mattmann (chrismattmann)
              Reporter: Scott Ganyo (scottganyo)
              Votes: 1
              Watchers: 1