NUTCH-634: Patch - Nutch - Hadoop 0.17.1

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.0.0
    • Component/s: None
    • Labels: None
    • Patch Info: Patch Available

      Description

      This is a patch so that Nutch can be used with Hadoop 0.17.0. The patch is located at http://pastie.org/212001

      The patch compiles and passes all current Nutch unit tests.

      I have tested that the crawler side of Nutch (i.e. inject, generate, fetch, parse, and merge with the crawldb) definitely works, but I have not tested the Lucene indexing part, so it may or may not work.

      NOTE: The two main bugs that had to be overcome were not caught by any of the unit tests; they only surfaced during actual testing. They were:

      1. Changes to the Hadoop Iterator
      2. Addition of Serialization to MapReduce Framework
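
      The description does not say what changed in the iterator. One plausible reading, stated here as an assumption and not something this report confirms, is that the framework began handing back the same value object on every call to next(), so code that stored references instead of copies would silently see only the last value. The sketch below shows that general pitfall in plain Java with no Hadoop classes; MutableValue, reusingIterator, keepReferences, and keepCopies are hypothetical names made up for illustration.

      ```java
      import java.util.ArrayList;
      import java.util.Iterator;
      import java.util.List;
      import java.util.NoSuchElementException;

      public class ReusedValueSketch {

          /** Hypothetical mutable value, standing in for a framework-owned record. */
          static final class MutableValue {
              private String content;
              MutableValue(String content) { this.content = content; }
              void set(String content) { this.content = content; }
              String get() { return content; }
          }

          /** Assumed behavior: one shared instance is refilled and returned by every next(). */
          static Iterator<MutableValue> reusingIterator(final String... raw) {
              return new Iterator<MutableValue>() {
                  private final MutableValue shared = new MutableValue(null);
                  private int pos = 0;

                  public boolean hasNext() { return pos < raw.length; }

                  public MutableValue next() {
                      if (!hasNext()) throw new NoSuchElementException();
                      shared.set(raw[pos++]);   // overwrite the shared instance in place
                      return shared;
                  }

                  public void remove() { throw new UnsupportedOperationException(); }
              };
          }

          /** Buggy pattern: stores references, which all point at the one shared instance. */
          static List<String> keepReferences(Iterator<MutableValue> values) {
              List<MutableValue> held = new ArrayList<MutableValue>();
              while (values.hasNext()) held.add(values.next());
              return contents(held);
          }

          /** Safe pattern: copies each value before the iterator advances. */
          static List<String> keepCopies(Iterator<MutableValue> values) {
              List<MutableValue> held = new ArrayList<MutableValue>();
              while (values.hasNext()) held.add(new MutableValue(values.next().get()));
              return contents(held);
          }

          /** Reads the stored values back out after iteration has finished. */
          private static List<String> contents(List<MutableValue> held) {
              List<String> out = new ArrayList<String>();
              for (MutableValue v : held) out.add(v.get());
              return out;
          }

          public static void main(String[] args) {
              System.out.println(keepReferences(reusingIterator("a", "b", "c"))); // [c, c, c]
              System.out.println(keepCopies(reusingIterator("a", "b", "c")));     // [a, b, c]
          }
      }
      ```

      A unit test that consumes each value immediately would pass under either pattern, which is consistent with the note that the existing tests did not catch the problem.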

        Attachments

        1. diff (38 kB, Michael Gottesman)
        2. hadoop-0.17.patch (42 kB, Lincoln Ritter)
        3. hadoop-0.17.patch (44 kB, Lincoln Ritter)
        4. hadoop-0.17.patch (90 kB, Andrzej Bialecki)

            People

            • Assignee: Andrzej Bialecki (ab)
            • Reporter: Michael Gottesman (gottesmm)
            • Votes: 0
            • Watchers: 1
