  1. Nutch
  2. NUTCH-634

Patch - Nutch - Hadoop 0.17.1

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.0.0
    • Component/s: None
    • Labels: None
    • Patch Info: Patch Available

    Description

      This is a patch so that Nutch can be used with Hadoop 0.17.0. The patch is located at http://pastie.org/212001

      The patch compiles and passes all current Nutch unit tests.

      I have tested that the crawler side of Nutch (i.e. inject, generate, fetch, parse, merge with crawldb) definitely works, but I have not tested the Lucene indexing part. It might work, but it might not.

      NOTE: the two main bugs that had to be overcome were not caught by any of the unit tests; they only came up during actual testing. The bugs were:

      1. Changes to the Hadoop Iterator
      2. Addition of Serialization to MapReduce Framework
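
The issue text does not say exactly what changed in the iterator. One pitfall that fits the description, and this is an assumption rather than something stated here, is an iterator that hands back the same mutable value object on every call to next(), so code that keeps references instead of copies sees only the last value. The sketch below shows that pattern in plain Java with no Hadoop dependency; the names ReusedValueDemo, Value, reusingIterator, collectByReference and collectByCopy are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReusedValueDemo {
    /** Hypothetical mutable value, standing in for a record passed to a reducer. */
    static final class Value {
        int n;
        Value(int n) { this.n = n; }
        Value copy() { return new Value(n); }
    }

    /** Iterator that returns the SAME Value instance each time, mutated in place. */
    static Iterator<Value> reusingIterator(final int[] data) {
        return new Iterator<Value>() {
            private final Value shared = new Value(0);
            private int i = 0;
            public boolean hasNext() { return i < data.length; }
            public Value next() {
                if (!hasNext()) throw new NoSuchElementException();
                shared.n = data[i++];   // overwrite the shared instance
                return shared;
            }
            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    /** Keeps references: every list entry ends up pointing at the shared object. */
    static List<Integer> collectByReference(Iterator<Value> it) {
        List<Value> kept = new ArrayList<Value>();
        while (it.hasNext()) kept.add(it.next());
        return toInts(kept);
    }

    /** Copies each value before storing it, so later mutations do not leak in. */
    static List<Integer> collectByCopy(Iterator<Value> it) {
        List<Value> kept = new ArrayList<Value>();
        while (it.hasNext()) kept.add(it.next().copy());
        return toInts(kept);
    }

    private static List<Integer> toInts(List<Value> values) {
        List<Integer> out = new ArrayList<Integer>();
        for (Value v : values) out.add(v.n);
        return out;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        System.out.println(collectByReference(reusingIterator(data))); // [3, 3, 3]
        System.out.println(collectByCopy(reusingIterator(data)));      // [1, 2, 3]
    }
}
```

A unit test that only counts values or inspects them one at a time would pass either way, which is consistent with the note that the unit tests did not catch these bugs.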

      Attachments

        1. diff
          38 kB
          Michael Gottesman
        2. hadoop-0.17.patch
          90 kB
          Andrzej Bialecki
        3. hadoop-0.17.patch
          44 kB
          Lincoln Ritter
        4. hadoop-0.17.patch
          42 kB
          Lincoln Ritter

          People

            Assignee: Andrzej Bialecki (ab)
            Reporter: Michael Gottesman (gottesmm)
            Votes: 0
            Watchers: 1