Nutch / NUTCH-667

Input Format for working with Content in Hadoop Streaming

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.0.0
    • Component/s: None
    • Labels: None
    • Environment: All

    Description

      This is a ContentAsText input format that replaces line endings with spaces, so that Nutch content can be used more effectively inside Hadoop streaming jobs. Hadoop streaming allows MapReduce jobs to be written in any language that can communicate through stdin and stdout.
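
      As an illustration of why single-line records matter, here is a minimal sketch of a streaming mapper in Python. It assumes each record reaches the mapper on stdin as one line of the form key, tab, value, with the page URL as the key and the flattened content as the value. That record layout, and the names parse_record, map_record and run, are assumptions made for this example and are not taken from the patch.

```python
import sys


def parse_record(line):
    """Split one streaming record into (url, content).

    Assumes the key and value are separated by the first tab and that
    the content contains no newlines (they were replaced by spaces).
    """
    key, _, value = line.rstrip("\n").partition("\t")
    return key, value


def map_record(url, content):
    """Emit the URL and the number of whitespace-separated tokens."""
    return f"{url}\t{len(content.split())}"


def run(stdin=sys.stdin, stdout=sys.stdout):
    """Read records line by line and write one output line per record."""
    for line in stdin:
        if not line.strip():
            continue  # skip blank lines
        url, content = parse_record(line)
        stdout.write(map_record(url, content) + "\n")


if __name__ == "__main__":
    run()
```

      Because every record occupies exactly one line, the mapper can iterate over stdin without any multi-line parsing. A script like this would typically be passed to Hadoop streaming through its -mapper option, with the input format selected through -inputformat.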

      Attachments

        1. NUTCH-667-1-20081126.patch
          3 kB
          Dennis Kubes

          People

            Assignee: Dennis Kubes
            Reporter: Dennis Kubes
            Votes: 0
            Watchers: 0

            Dates

              Created:
              Updated:
              Resolved: