Hadoop Map/Reduce / MAPREDUCE-3319

multifilewc from hadoop examples seems to be broken in 0.20.205.0

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.20.205.0
    • Fix Version/s: 1.0.0
    • Component/s: examples
    Description

      /usr/lib/hadoop/bin/hadoop jar /usr/lib/hadoop/hadoop-examples-0.20.205.0.22.jar multifilewc  examples/text examples-output/multifilewc
      11/10/31 16:50:26 INFO mapred.FileInputFormat: Total input paths to process : 2
      11/10/31 16:50:26 INFO mapred.JobClient: Running job: job_201110311350_0220
      11/10/31 16:50:27 INFO mapred.JobClient:  map 0% reduce 0%
      11/10/31 16:50:42 INFO mapred.JobClient: Task Id : attempt_201110311350_0220_m_000000_0, Status : FAILED
      java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable
      	at org.apache.hadoop.mapred.lib.LongSumReducer.reduce(LongSumReducer.java:44)
      	at org.apache.hadoop.mapred.Task$OldCombinerRunner.combine(Task.java:1431)
      	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1436)
      	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1298)
      	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:437)
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
      	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:396)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
      	at org.apache.hadoop.mapred.Child.main(Child.java:249)
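
The stack trace points at a type mismatch: the example configures LongSumReducer (which casts each value to LongWritable) as the combiner, while the map side emits IntWritable values. As a minimal plain-Java illustration of the failure mode (no Hadoop dependencies; the class and method names here are hypothetical, not from the example):

```java
import java.util.List;

public class CastDemo {
    // Mimics LongSumReducer.reduce: sums values it assumes are Longs.
    static long sum(Iterable<?> values) {
        long total = 0;
        for (Object v : values) {
            total += (Long) v; // throws ClassCastException when v is an Integer
        }
        return total;
    }

    public static void main(String[] args) {
        // The mapper emitted Integer (think IntWritable) values...
        List<?> mapOutput = List.of(1, 1, 1);
        try {
            sum(mapOutput); // ...but the combiner expects Longs
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, as in the job log");
        }
    }
}
```

The cast only fails at combine time, which is why the job submits cleanly and then dies during the map-side spill.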
      

      Attachments

        1. MAPREDUCE-3319.patch
          2 kB
          Subroto Sanyal

        Activity

          rvs Roman Shaposhnik added a comment -

          I really think we should fix this before releasing any version that has a potential of becoming Hadoop 1.0. I'll try to see if I can cook up a patch to fix it. If not, I'd say it is better to disable this example than to ship a broken one.

          Agreed?
          mattf Matthew Foley added a comment -

          Moved requested target versions from Fix Version to Target Version field.
          Absent a viable patch in time for 1.0.0, this will have to go in 1.1.0.


          rvs Roman Shaposhnik added a comment -

          Matt, what's the cut-off date for providing a patch? I'd really like to see this fixed in 1.0.

          rvs Roman Shaposhnik added a comment -

          Ok, so this appears to be fixed in branch-1.0 rev 1207581.

          Is this the branch from which 1.0.0 will be cut?
          subrotosanyal Subroto Sanyal added a comment -

          The patch makes the code compatible with LongWritable as the final output value class.
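
The attached patch itself is not reproduced in this issue, but the change it describes, making the example's value types consistent with LongWritable end to end, can be sketched in plain Java as a sum that widens any Number to long. Names here are illustrative only, not taken from the patch:

```java
import java.util.List;

public class CompatibleSum {
    // A combiner-safe sum: widens Integer or Long alike to long,
    // analogous to keeping the value class LongWritable throughout
    // the map, combine, and reduce stages.
    static long sum(Iterable<? extends Number> values) {
        long total = 0;
        for (Number v : values) {
            total += v.longValue(); // no cast, so mixed widths cannot throw
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(List.of(1, 2, 3L))); // prints 6
    }
}
```

The design point is that the fix removes the implicit narrowing assumption rather than working around it at the call site.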
          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12506241/MAPREDUCE-3319.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1399//console

          This message is automatically generated.

          subrotosanyal Subroto Sanyal added a comment -

          The patch is prepared on branch:
          http://svn.apache.org/repos/asf/hadoop/common/branches/branch-1

          The patch resolves an issue in hadoop-examples, so no tests are included.
          mattf Matthew Foley added a comment -

          Subroto, thanks for providing a patch. Unfortunately you also set the "Fix Version" to 1.0.0, which implies it is already fixed in 1.0.0, which it apparently isn't, according to Roman's email of today. (Although Roman's comment of 28/Nov/11 21:49 also seems to imply it was fixed in 1.0.0. Perhaps that was user error.)

          Please do not set "Fix Version" until after the patch is committed to a given version. The intent or desire to have a fix in a given version should be expressed in "Target Version", not "Fix Version".

          +1 for code review, lgtm. Committed to branch-1 for future release 1.1.0.
          Thanks Subroto!

          mattf Matthew Foley added a comment -

          It appears this change is also applicable to trunk, file
          hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/MultiFileWordCount.java

          Subroto, please port your patch to trunk and post another patch.
          Then I can resolve this jira. Thank you.

          subrotosanyal Subroto Sanyal added a comment -

          Hi Matt,

          I ran the multifilewc in trunk (mrv2) and it runs fine.
          Output of the run:

          2011-12-12 10:35:16,569 INFO  mapreduce.Job (Job.java:monitorAndPrintJob(1227))
          -  map 100% reduce 100%
          2011-12-12 10:35:16,574 INFO  mapreduce.Job (Job.java:monitorAndPrintJob(1238))
          - Job job_1322715021135_0002 completed successfully
          2011-12-12 10:35:16,743 INFO  mapreduce.Job (Job.java:monitorAndPrintJob(1245))
          - Counters: 43
                  File System Counters
                          FILE: BYTES_READ=97054
                          FILE: BYTES_WRITTEN=190175
                          FILE: READ_OPS=0
                          FILE: LARGE_READ_OPS=0
                          FILE: WRITE_OPS=0
                          HDFS: BYTES_READ=1165096553
                          HDFS: BYTES_WRITTEN=4394
                          HDFS: READ_OPS=32
                          HDFS: LARGE_READ_OPS=0
                          HDFS: WRITE_OPS=4
                  org.apache.hadoop.mapreduce.JobCounter
                          TOTAL_LAUNCHED_MAPS=1
                          TOTAL_LAUNCHED_REDUCES=1
                          OTHER_LOCAL_MAPS=1
                          SLOTS_MILLIS_MAPS=67860
                          SLOTS_MILLIS_REDUCES=65602
                  org.apache.hadoop.mapreduce.TaskCounter
                          MAP_INPUT_RECORDS=265824
                          MAP_OUTPUT_RECORDS=531648
                          MAP_OUTPUT_BYTES=1166967360
                          MAP_OUTPUT_MATERIALIZED_BYTES=4402
                          SPLIT_RAW_BYTES=989
                          COMBINE_INPUT_RECORDS=531678
                          COMBINE_OUTPUT_RECORDS=32
                          REDUCE_INPUT_GROUPS=2
                          REDUCE_SHUFFLE_BYTES=4402
                          REDUCE_INPUT_RECORDS=2
                          REDUCE_OUTPUT_RECORDS=2
                          SPILLED_RECORDS=46
                          SHUFFLED_MAPS=1
                          FAILED_SHUFFLE=0
                          MERGED_MAP_OUTPUTS=1
                          GC_TIME_MILLIS=2566
                          CPU_MILLISECONDS=69150
                          PHYSICAL_MEMORY_BYTES=200130560
                          VIRTUAL_MEMORY_BYTES=756015104
                          COMMITTED_HEAP_BYTES=137433088
                  Shuffle Errors
                          BAD_ID=0
                          CONNECTION=0
                          IO_ERROR=0
                          WRONG_LENGTH=0
                          WRONG_MAP=0
                          WRONG_REDUCE=0
                  org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
                          BYTES_READ=0
                  org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter
                          BYTES_WRITTEN=4394
          

          I think the problem is resolved in trunk already.

          mattf Matthew Foley added a comment -

          Okay, Subroto, thanks.
          Release 1.0.0 RC had to be re-spun, so I merged this fix as promised.
          subrotosanyal Subroto Sanyal added a comment -

          Thanks Matt...

          mattf Matthew Foley added a comment -

          Closed upon release of version 1.0.0.


          People

            Assignee: subrotosanyal Subroto Sanyal
            Reporter: rvs Roman Shaposhnik
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved: