Hadoop Map/Reduce
MAPREDUCE-5410

Word count MapReduce job output is incomplete when the input is on HDFS


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.0.3
    • Fix Version/s: None
    • Component/s: examples, job submission
    • Labels: None
    • Environment: ubuntu

    Description

      Hi,

      I am new to Hadoop.
      While practicing with a custom MapReduce word count program, I found that the output is not what I expect when the job reads its input from HDFS. When I run the same program against a file on the local (Unix) file system, I get the expected result.
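      To be precise about the two cases: the only difference between the two runs is which file system the job's input and output paths resolve against. The host name and paths in the sketch below are placeholders for illustration, not my actual setup.

      Illustration of the two cases (sketch only)
      ===========================================

      import org.apache.hadoop.fs.Path;
      import org.apache.hadoop.mapred.FileInputFormat;
      import org.apache.hadoop.mapred.FileOutputFormat;
      import org.apache.hadoop.mapred.JobConf;

      public class PathSchemeExample {
          // Placeholder host name and paths, for illustration only.
          static void configurePaths(JobConf conf, boolean onHdfs) {
              if (onHdfs) {
                  // "HDFS based file": paths resolve against the name node
                  FileInputFormat.setInputPaths(conf,
                      new Path("hdfs://namenode:9000/user/test/input.txt"));
                  FileOutputFormat.setOutputPath(conf,
                      new Path("hdfs://namenode:9000/user/test/out"));
              } else {
                  // "Unix based file": paths resolve against the local file system
                  FileInputFormat.setInputPaths(conf,
                      new Path("file:///home/test/input.txt"));
                  FileOutputFormat.setOutputPath(conf,
                      new Path("file:///home/test/out"));
              }
          }
      }
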
      Below are the details of my code.

      MapReduce in Java
      ==================

      import java.io.IOException;
      import java.util.*;

      import org.apache.hadoop.fs.Path;
      import org.apache.hadoop.conf.*;
      import org.apache.hadoop.io.*;
      import org.apache.hadoop.mapred.*;

      public class WordCount1 {

          public static class Map extends MapReduceBase
                  implements Mapper<LongWritable, Text, Text, IntWritable> {

              private final static IntWritable one = new IntWritable(1);
              private Text word = new Text();

              public void map(LongWritable key, Text value,
                              OutputCollector<Text, IntWritable> output,
                              Reporter reporter) throws IOException {
                  String line = value.toString();
                  StringTokenizer tokenizer = new StringTokenizer(line);
                  // Emit (word, 1) for every whitespace-separated token in the line.
                  while (tokenizer.hasMoreTokens()) {
                      word.set(tokenizer.nextToken());
                      output.collect(word, one);
                  }
              }
          }

          public static class Reduce extends MapReduceBase
                  implements Reducer<Text, IntWritable, Text, IntWritable> {

              public void reduce(Text key, Iterator<IntWritable> values,
                                 OutputCollector<Text, IntWritable> output,
                                 Reporter reporter) throws IOException {
                  int sum = 0;
                  while (values.hasNext()) {
                      sum += values.next().get();
                  }
                  // Only emit words that occur more than once.
                  if (sum > 1) {
                      output.collect(key, new IntWritable(sum));
                  }
              }
          }

          public static void main(String[] args) throws Exception {
              JobConf conf = new JobConf();
              conf.setJarByClass(WordCount1.class);
              conf.setJobName("wordcount1");

              conf.setOutputKeyClass(Text.class);
              conf.setOutputValueClass(IntWritable.class);

              conf.setMapperClass(Map.class);
              // The same Reduce class serves as both combiner and reducer.
              conf.setCombinerClass(Reduce.class);
              conf.setReducerClass(Reduce.class);

              conf.setInputFormat(TextInputFormat.class);
              conf.setOutputFormat(TextOutputFormat.class);

              Path inPath = new Path(args[0]);
              Path outPath = new Path(args[1]);   // output directory is the second argument
              FileInputFormat.setInputPaths(conf, inPath);
              FileOutputFormat.setOutputPath(conf, outPath);

              JobClient.runJob(conf);
          }
      }

      Input file
      ===========
      test my program
      during test and my hadoop
      your during
      get program

      Output generated by Hadoop when the job reads its input from HDFS
      =======================================
      during 2
      my 2
      test 2

      Output generated by Hadoop when the job reads its input from the local file system
      =======================================
      during 2
      my 2
      program 2
      test 2
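
      For reference, the counting and the "sum > 1" filter can be reproduced over the same four input lines in plain Java, outside Hadoop. The class below is a stand-alone sketch made up for illustration; it is not part of the submitted job.

      Plain Java check (sketch only)
      ==============================

      import java.util.Map;
      import java.util.StringTokenizer;
      import java.util.TreeMap;

      public class WordCountCheck {
          public static void main(String[] args) {
              // The four lines of the sample input file shown above.
              String[] lines = {
                  "test my program",
                  "during test and my hadoop",
                  "your during",
                  "get program"
              };
              // Count every whitespace-separated token across the four lines.
              TreeMap<String, Integer> counts = new TreeMap<String, Integer>();
              for (String line : lines) {
                  StringTokenizer tokenizer = new StringTokenizer(line);
                  while (tokenizer.hasMoreTokens()) {
                      String word = tokenizer.nextToken();
                      Integer current = counts.get(word);
                      counts.put(word, current == null ? 1 : current + 1);
                  }
              }
              // Apply the same "sum > 1" filter as the Reduce class above.
              for (Map.Entry<String, Integer> entry : counts.entrySet()) {
                  if (entry.getValue() > 1) {
                      System.out.println(entry.getKey() + "\t" + entry.getValue());
                  }
              }
          }
      }

      Run on its own, this prints during 2, my 2, program 2, test 2, which matches the local file system output above but not the HDFS output.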

      Please help me with this issue.

    Attachments

    Activity

    People

        Assignee: Unassigned
        Reporter: mullangi13 Mullangi
        Votes: 0
        Watchers: 1

    Dates

        Created:
        Updated: