Hadoop Common / HADOOP-7220

documentation lists options in wrong order

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • documentation

    Description

      On http://hadoop.apache.org/common/docs/r0.20.2/streaming.html various examples use -D flags.

      I noticed that if you invoke hadoop with the -D flags placed after the streaming options, as in the run below, it does not work.

      ========================
      dplaetin@n-0:/usr/local/hadoop/bin$ ./hadoop jar /usr/local/hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar -file /proj/Search/wall/experiment/ -mapper './build-models.py --mapper' -reducer './build-models.py --reducer' -input sim-input -output sim-output -D mapred.output.key.comparator.class=org.apache.hadoop.mapred.lib.KeyFieldBasedComparator -D mapred.text.key.comparator.options=-k1,2n
      11/04/12 10:39:28 ERROR streaming.StreamJob: Unrecognized option: -D

      Usage: $HADOOP_HOME/bin/hadoop jar \
      $HADOOP_HOME/hadoop-streaming.jar [options]
      Options:
      -input <path> DFS input file(s) for the Map step
      -output <path> DFS output directory for the Reduce step
      -mapper <cmd|JavaClassName> The streaming command to run
      -combiner <JavaClassName> Combiner has to be a Java class
      -reducer <cmd|JavaClassName> The streaming command to run
      -file <file> File/dir to be shipped in the Job jar file
      -inputformat TextInputFormat(default)|SequenceFileAsTextInputFormat|JavaClassName Optional.
      -outputformat TextOutputFormat(default)|JavaClassName Optional.
      -partitioner JavaClassName Optional.
      -numReduceTasks <num> Optional.
      -inputreader <spec> Optional.
      -cmdenv <n>=<v> Optional. Pass env.var to streaming commands
      -mapdebug <path> Optional. To run this script when a map task fails
      -reducedebug <path> Optional. To run this script when a reduce task fails
      -verbose

      Generic options supported are
      -conf <configuration file> specify an application configuration file
      -D <property=value> use value for given property
      -fs <local|namenode:port> specify a namenode
      -jt <local|jobtracker:port> specify a job tracker
      -files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
      -libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
      -archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.

      The general command line syntax is
      bin/hadoop command [genericOptions] [commandOptions]

      For more details about these options:
      Use $HADOOP_HOME/bin/hadoop jar build/hadoop-streaming.jar -info

      Streaming Job Failed!
      ========================

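      I could only make it work by moving the -D flags to the front (right after the streaming.jar part). Maybe this is because -D is a generic option and generic options have to come before the command options, as the usage line "bin/hadoop command [genericOptions] [commandOptions]" in the output suggests.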
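The reordering described above can be sketched as follows. The paths, jar name, and option values are copied from the failing run in this report; the only change is that both -D pairs now sit directly after the streaming jar, ahead of the streaming-specific options. The dry-run printing is an assumption added for illustration, so the snippet lists the arguments instead of submitting a job, and it has not been checked against a cluster.

```shell
# Build the argument list with the generic options (-D) placed first,
# right after the streaming jar, and the streaming options after them.
set -- jar /usr/local/hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar \
  -D mapred.output.key.comparator.class=org.apache.hadoop.mapred.lib.KeyFieldBasedComparator \
  -D mapred.text.key.comparator.options=-k1,2n \
  -file /proj/Search/wall/experiment/ \
  -mapper './build-models.py --mapper' \
  -reducer './build-models.py --reducer' \
  -input sim-input \
  -output sim-output

# Dry run (illustrative assumption): print one argument per line.
# To submit the job for real, replace the next line with:
#   /usr/local/hadoop/bin/hadoop "$@"
printf '%s\n' "$@"
```

Keeping the arguments in "$@" preserves the quoting of the mapper and reducer commands, which each contain a space.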

          People

            Assignee: Unassigned
            Reporter: Dieter Plaetinck (dieter_be)
              Time Tracking

                Original Estimate: 1h
                Remaining Estimate: 1h
                Time Spent: Not Specified