Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-3319

multifilewc from hadoop examples seems to be broken in 0.20.205.0

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.20.205.0
    • Fix Version/s: 1.0.0
    • Component/s: examples
    • Labels:
    • Target Version/s:

      Description

      /usr/lib/hadoop/bin/hadoop jar /usr/lib/hadoop/hadoop-examples-0.20.205.0.22.jar multifilewc  examples/text examples-output/multifilewc
      11/10/31 16:50:26 INFO mapred.FileInputFormat: Total input paths to process : 2
      11/10/31 16:50:26 INFO mapred.JobClient: Running job: job_201110311350_0220
      11/10/31 16:50:27 INFO mapred.JobClient:  map 0% reduce 0%
      11/10/31 16:50:42 INFO mapred.JobClient: Task Id : attempt_201110311350_0220_m_000000_0, Status : FAILED
      java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable
      	at org.apache.hadoop.mapred.lib.LongSumReducer.reduce(LongSumReducer.java:44)
      	at org.apache.hadoop.mapred.Task$OldCombinerRunner.combine(Task.java:1431)
      	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1436)
      	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1298)
      	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:437)
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
      	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:396)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
      	at org.apache.hadoop.mapred.Child.main(Child.java:249)
      

        Activity

        Hide
        Matt Foley added a comment -

        Closed upon release of version 1.0.0.

        Show
        Matt Foley added a comment - Closed upon release of version 1.0.0.
        Hide
        Subroto Sanyal added a comment -

        Thanks Matt...

        Show
        Subroto Sanyal added a comment - Thanks Matt...
        Hide
        Matt Foley added a comment -

        Okay, Subroto, thanks.
        Release 1.0.0 RC had to be re-spun, so merged this fix as promised.

        Show
        Matt Foley added a comment - Okay, Subroto, thanks. Release 1.0.0 RC had to be re-spun, so merged this fix as promised.
        Hide
        Subroto Sanyal added a comment -

        Hi Matt,

        I ran the multifilewc in trunk (mrv2) and it runs fine.
        Output of the run:

        2011-12-12 10:35:16,569 INFO  mapreduce.Job (Job.java:monitorAndPrintJob(1227))
        -  map 100% reduce 100%
        2011-12-12 10:35:16,574 INFO  mapreduce.Job (Job.java:monitorAndPrintJob(1238))
        - Job job_1322715021135_0002 completed successfully
        2011-12-12 10:35:16,743 INFO  mapreduce.Job (Job.java:monitorAndPrintJob(1245))
        - Counters: 43
                File System Counters
                        FILE: BYTES_READ=97054
                        FILE: BYTES_WRITTEN=190175
                        FILE: READ_OPS=0
                        FILE: LARGE_READ_OPS=0
                        FILE: WRITE_OPS=0
                        HDFS: BYTES_READ=1165096553
                        HDFS: BYTES_WRITTEN=4394
                        HDFS: READ_OPS=32
                        HDFS: LARGE_READ_OPS=0
                        HDFS: WRITE_OPS=4
                org.apache.hadoop.mapreduce.JobCounter
                        TOTAL_LAUNCHED_MAPS=1
                        TOTAL_LAUNCHED_REDUCES=1
                        OTHER_LOCAL_MAPS=1
                        SLOTS_MILLIS_MAPS=67860
                        SLOTS_MILLIS_REDUCES=65602
                org.apache.hadoop.mapreduce.TaskCounter
                        MAP_INPUT_RECORDS=265824
                        MAP_OUTPUT_RECORDS=531648
                        MAP_OUTPUT_BYTES=1166967360
                        MAP_OUTPUT_MATERIALIZED_BYTES=4402
                        SPLIT_RAW_BYTES=989
                        COMBINE_INPUT_RECORDS=531678
                        COMBINE_OUTPUT_RECORDS=32
                        REDUCE_INPUT_GROUPS=2
                        REDUCE_SHUFFLE_BYTES=4402
                        REDUCE_INPUT_RECORDS=2
                        REDUCE_OUTPUT_RECORDS=2
                        SPILLED_RECORDS=46
                        SHUFFLED_MAPS=1
                        FAILED_SHUFFLE=0
                        MERGED_MAP_OUTPUTS=1
                        GC_TIME_MILLIS=2566
                        CPU_MILLISECONDS=69150
                        PHYSICAL_MEMORY_BYTES=200130560
                        VIRTUAL_MEMORY_BYTES=756015104
                        COMMITTED_HEAP_BYTES=137433088
                Shuffle Errors
                        BAD_ID=0
                        CONNECTION=0
                        IO_ERROR=0
                        WRONG_LENGTH=0
                        WRONG_MAP=0
                        WRONG_REDUCE=0
                org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
                        BYTES_READ=0
                org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter
                        BYTES_WRITTEN=4394
        

        I think the problem is resolved in trunk already.

        Show
        Subroto Sanyal added a comment - Hi Matt, I ran the multifilewc in trunk (mrv2) and it runs fine. Output of the run: 2011-12-12 10:35:16,569 INFO mapreduce.Job (Job.java:monitorAndPrintJob(1227)) - map 100% reduce 100% 2011-12-12 10:35:16,574 INFO mapreduce.Job (Job.java:monitorAndPrintJob(1238)) - Job job_1322715021135_0002 completed successfully 2011-12-12 10:35:16,743 INFO mapreduce.Job (Job.java:monitorAndPrintJob(1245)) - Counters: 43 File System Counters FILE: BYTES_READ=97054 FILE: BYTES_WRITTEN=190175 FILE: READ_OPS=0 FILE: LARGE_READ_OPS=0 FILE: WRITE_OPS=0 HDFS: BYTES_READ=1165096553 HDFS: BYTES_WRITTEN=4394 HDFS: READ_OPS=32 HDFS: LARGE_READ_OPS=0 HDFS: WRITE_OPS=4 org.apache.hadoop.mapreduce.JobCounter TOTAL_LAUNCHED_MAPS=1 TOTAL_LAUNCHED_REDUCES=1 OTHER_LOCAL_MAPS=1 SLOTS_MILLIS_MAPS=67860 SLOTS_MILLIS_REDUCES=65602 org.apache.hadoop.mapreduce.TaskCounter MAP_INPUT_RECORDS=265824 MAP_OUTPUT_RECORDS=531648 MAP_OUTPUT_BYTES=1166967360 MAP_OUTPUT_MATERIALIZED_BYTES=4402 SPLIT_RAW_BYTES=989 COMBINE_INPUT_RECORDS=531678 COMBINE_OUTPUT_RECORDS=32 REDUCE_INPUT_GROUPS=2 REDUCE_SHUFFLE_BYTES=4402 REDUCE_INPUT_RECORDS=2 REDUCE_OUTPUT_RECORDS=2 SPILLED_RECORDS=46 SHUFFLED_MAPS=1 FAILED_SHUFFLE=0 MERGED_MAP_OUTPUTS=1 GC_TIME_MILLIS=2566 CPU_MILLISECONDS=69150 PHYSICAL_MEMORY_BYTES=200130560 VIRTUAL_MEMORY_BYTES=756015104 COMMITTED_HEAP_BYTES=137433088 Shuffle Errors BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0 org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter BYTES_READ=0 org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter BYTES_WRITTEN=4394 I think the problem is resolved in trunk already.
        Hide
        Matt Foley added a comment -

        It appears this change is also applicable to trunk, file
        hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/MultiFileWordCount.java

        Subroto, please port your patch to trunk and post another patch.
        Then I can resolve this jira. Thank you.

        Show
        Matt Foley added a comment - It appears this change is also applicable to trunk, file hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/MultiFileWordCount.java Subroto, please port your patch to trunk and post another patch. Then I can resolve this jira. Thank you.
        Hide
        Matt Foley added a comment -

        Subroto, thanks for providing a patch. Unfortunately you also set the "Fix Version" to 1.0.0, which implies it is already fixed in 1.0.0, which it apparently isn't, according to Roman's email of today. (Although Roman's comment of 28/Nov/11 21:49 also seems to imply it was fixed in 1.0.0. Perhaps that was user error.)

        Please do not set "Fix Version" until after the patch is committed to a given version. The intent or desire to have a fix in a given version should be expressed in "Target Version", not "Fix Version".

        +1 for code review, lgtm. Committed to branch-1 for future release 1.1.0.
        Thanks Subroto!

        Show
        Matt Foley added a comment - Subroto, thanks for providing a patch. Unfortunately you also set the "Fix Version" to 1.0.0, which implies it is already fixed in 1.0.0, which it apparently isn't, according to Roman's email of today. (Although Roman's comment of 28/Nov/11 21:49 also seems to imply it was fixed in 1.0.0. Perhaps that was user error.) Please do not set "Fix Version" until after the patch is committed to a given version. The intent or desire to have a fix in a given version should be expressed in "Target Version", not "Fix Version". +1 for code review, lgtm. Committed to branch-1 for future release 1.1.0. Thanks Subroto!
        Hide
        Subroto Sanyal added a comment -

        The patch is prepared on branch:
        http://svn.apache.org/repos/asf/hadoop/common/branches/branch-1

        The patch resolves an issue in hadoop-example; so tests not included.

        Show
        Subroto Sanyal added a comment - The patch is prepared on branch: http://svn.apache.org/repos/asf/hadoop/common/branches/branch-1 The patch resolves an issue in hadoop-example; so tests not included.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12506241/MAPREDUCE-3319.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1399//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12506241/MAPREDUCE-3319.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1399//console This message is automatically generated.
        Hide
        Subroto Sanyal added a comment -

        The patch makes the code comapatible with LongWritable as final Output value class

        Show
        Subroto Sanyal added a comment - The patch makes the code comapatible with LongWritable as final Output value class
        Hide
        Roman Shaposhnik added a comment -

        Ok, so this appears to be fixed in branch-1.0 rev 1207581

        Is this the branch from which 1.0.0 will be cut from?

        Show
        Roman Shaposhnik added a comment - Ok, so this appears to be fixed in branch-1.0 rev 1207581 Is this the branch from which 1.0.0 will be cut from?
        Hide
        Roman Shaposhnik added a comment -

        Matt, what's the cut-off date for providing a patch? I'd really like to see this fixed in 1.0.

        Show
        Roman Shaposhnik added a comment - Matt, what's the cut-off date for providing a patch? I'd really like to see this fixed in 1.0.
        Hide
        Matt Foley added a comment -

        Moved requested target versions from Fix Version to Target Version field.
        Absent a viable patch in time for 1.0.0, this will have to go in 1.1.0.

        Show
        Matt Foley added a comment - Moved requested target versions from Fix Version to Target Version field. Absent a viable patch in time for 1.0.0, this will have to go in 1.1.0.
        Hide
        Roman Shaposhnik added a comment -

        I really think we should fix this before releasing any version that has a potential of becoming Hadoop 1.0. I'll try to see if I can cook up a patch to fix it. If not – I'd say it is better to disable this example, rather than shipping a broken one.

        Agreed?

        Show
        Roman Shaposhnik added a comment - I really think we should fix this before releasing any version that has a potential of becoming Hadoop 1.0. I'll try to see if I can cook up a patch to fix it. If not – I'd say it is better to disable this example, rather than shipping a broken one. Agreed?

          People

          • Assignee:
            Subroto Sanyal
            Reporter:
            Roman Shaposhnik
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development