  1. Hadoop Map/Reduce
  2. MAPREDUCE-6827

Failed to traverse Iterable values the second time in reduce() method


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 3.0.0-alpha1
    • Fix Version/s: None
    • Component/s: task
    • Labels: None
    • Environment: hadoop2.7.3

    Description

      Failed to traverse Iterable values the second time in reduce() method

      The following code is a reduce() method (of WordCount):

      WordCount.java
      	public static class WcReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
      
      		@Override
      		protected void reduce(Text key, Iterable<IntWritable> values, Context context)
      				throws IOException, InterruptedException {
      
      			// print some logs
      			List<String> vals = new LinkedList<>();
      			for(IntWritable i : values) {
      				vals.add(i.toString());
      			}
      			System.out.println(String.format(">>>> reduce(%s, [%s])",
      					key, String.join(", ", vals)));
      
      			// sum of values
      			int sum = 0;
      			for(IntWritable i : values) {
      				sum += i.get();
      			}
      			System.out.println(String.format(">>>> reduced(%s, %s)",
      					key, sum));
      			
      			context.write(key, new IntWritable(sum));
      		}			
      	}
      

      After running the job, every sum in the output was zero.

      Debugging showed that the body of the second for-each loop never executed. The root cause is the return value of Iterable.iterator(): values.iterator() returned the same Iterator instance on both calls made by the two for-each loops, so the second loop started with an iterator that was already exhausted. In general, Iterable.iterator() is expected to return a new instance on each call, as ArrayList.iterator() does.
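
The behaviour described above can be reproduced without a cluster. The following is a minimal sketch in plain Java, not Hadoop code: `SinglePassIterable` is a hypothetical stand-in that, like the `values` argument in the report, hands back the same `Iterator` on every call. `sumAfterLogging` mirrors the two-loop structure of the reported `reduce()`, and `sumInOnePass` shows one possible workaround, assuming the log line and the sum can be built in a single traversal. The issue's resolution (Not A Problem) suggests this single-pass behaviour is intended.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SinglePassDemo {

    /** Hypothetical stand-in for the reducer's values: iterator() always returns the same instance. */
    static final class SinglePassIterable<T> implements Iterable<T> {
        private final Iterator<T> shared;

        SinglePassIterable(List<T> source) {
            this.shared = source.iterator();
        }

        @Override
        public Iterator<T> iterator() {
            return shared; // same instance on every call, so only one traversal is possible
        }
    }

    /** Mirrors the reported reduce(): traverse once for logging, then again to sum. */
    static int sumAfterLogging(Iterable<Integer> values) {
        List<String> vals = new ArrayList<>();
        for (Integer i : values) {
            vals.add(i.toString());
        }
        int sum = 0;
        for (Integer i : values) { // body is never entered for a single-pass Iterable
            sum += i;
        }
        return sum;
    }

    /** Possible workaround: collect the log strings and the sum in one traversal. */
    static int sumInOnePass(Iterable<Integer> values) {
        List<String> vals = new ArrayList<>();
        int sum = 0;
        for (Integer i : values) {
            vals.add(i.toString());
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 1, 1);
        System.out.println(sumAfterLogging(new SinglePassIterable<>(data))); // prints 0
        System.out.println(sumInOnePass(new SinglePassIterable<>(data)));    // prints 3
        System.out.println(sumAfterLogging(data));                           // prints 3 (List is re-iterable)
    }
}
```

If the values really must be traversed twice inside a reducer, copying them into a local list during the first pass is another option. In that case it is safer to copy the primitive value (for example `i.get()`) rather than the `IntWritable` reference, since the framework reuses the value object between iterations.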

            People

              Assignee: Unassigned
              Reporter: javeme javaloveme