Nutch / NUTCH-266

hadoop bug when doing updatedb

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8
    • Fix Version/s: 0.8.1, 0.9.0
    • Component/s: None
    • Labels: None
    • Environment: Windows XP, JDK 1.4.2_04

    Description

      I constantly get the following error message when running updatedb:

      060508 230637 Running job: job_pbhn3t
      060508 230637 c:/nutch/crawl-20060508230625/crawldb/current/part-00000/data:0+245
      060508 230637 c:/nutch/crawl-20060508230625/segments/20060508230628/crawl_fetch/part-00000/data:0+296
      060508 230637 c:/nutch/crawl-20060508230625/segments/20060508230628/crawl_parse/part-00000:0+5258
      060508 230637 job_pbhn3t
      java.io.IOException: Target /tmp/hadoop/mapred/local/reduce_qnd5sx/map_qjp7tf.out already exists
      at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:162)
      at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:62)
      at org.apache.hadoop.fs.LocalFileSystem.renameRaw(LocalFileSystem.java:191)
      at org.apache.hadoop.fs.FileSystem.rename(FileSystem.java:306)
      at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:101)
      Exception in thread "main" java.io.IOException: Job failed!
      at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:341)
      at org.apache.nutch.crawl.CrawlDb.update(CrawlDb.java:54)
      at org.apache.nutch.crawl.Crawl.main(Crawl.java:114)
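
      Reading the trace from the bottom up, the local job runner calls rename, the local file system performs it as a copy, and the destination check rejects the copy because the target file is already present. The sketch below reproduces that failure mode in isolation and shows one possible way around it (deleting the stale target first). It is a standalone illustration built on java.io.File only; it is not Hadoop's FileUtil code and not the patch attached to this issue, and the class and method names (RenameSketch, strictRename, renameReplacing) are hypothetical.

```java
import java.io.File;
import java.io.IOException;

public class RenameSketch {

    // Refuse to overwrite: same kind of guard that raises the error in the trace.
    static void checkDest(File dst) throws IOException {
        if (dst.exists()) {
            throw new IOException("Target " + dst + " already exists");
        }
    }

    // Rename that fails whenever the destination is already present.
    static void strictRename(File src, File dst) throws IOException {
        checkDest(dst);
        if (!src.renameTo(dst)) {
            throw new IOException("Could not rename " + src + " to " + dst);
        }
    }

    // Assumed workaround: remove a leftover destination, then rename.
    static void renameReplacing(File src, File dst) throws IOException {
        if (dst.exists() && !dst.delete()) {
            throw new IOException("Could not delete existing target " + dst);
        }
        strictRename(src, dst);
    }

    public static void main(String[] args) throws IOException {
        // Both temp files exist on disk, so the destination is "stale".
        File src = File.createTempFile("map_", ".out");
        File dst = File.createTempFile("reduce_", ".out");

        try {
            strictRename(src, dst);
        } catch (IOException e) {
            System.out.println("strict rename failed: " + e.getMessage());
        }

        renameReplacing(src, dst);
        System.out.println("source still present: " + src.exists());
        System.out.println("target present: " + dst.exists());

        dst.delete(); // clean up the temp file
    }
}
```

      Deleting the target before the rename is only one option; whether a leftover file should be overwritten or the temporary path should be made unique per run is a design decision the trace alone does not settle.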

      Attachments

        1. patch_hadoop-0.5.0.diff
          1 kB
          Renaud Richardet
        2. patch.diff
          2 kB
          Renaud Richardet

          People

            Assignee: Unassigned
            Reporter: Eugen Kochuev
            Votes: 2
            Watchers: 1
