Hadoop Map/Reduce / MAPREDUCE-1851

Document configuration parameters in streaming

    Details

    • Hadoop Flags:
      Reviewed

      Description

      There are several streaming options, such as stream.map.output.field.separator, stream.num.map.output.key.fields, stream.map.input.field.separator, stream.reduce.input.field.separator, stream.map.input.ignoreKey, and stream.non.zero.exit.is.failure, that are scattered across many places. These should be documented in a single place, each with its description and default value.
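As an illustration of why the descriptions and defaults matter, here is a minimal Python sketch of how stream.map.output.field.separator and stream.num.map.output.key.fields are commonly understood to divide a line of mapper output into key and value. The function name `split_key_value` is hypothetical, and the defaults used here (tab separator, one key field) are assumptions to check against the streaming documentation; this is not Hadoop's own code.

```python
def split_key_value(line, separator="\t", num_key_fields=1):
    """Split one line of mapper output into (key, value).

    The key is the prefix up to the num_key_fields-th separator and the
    value is the remainder, excluding that separator. If the line has
    fewer separators than num_key_fields, the whole line is the key and
    the value is empty.
    """
    parts = line.split(separator)
    if len(parts) <= num_key_fields:
        return line, ""
    key = separator.join(parts[:num_key_fields])
    value = separator.join(parts[num_key_fields:])
    return key, value


if __name__ == "__main__":
    # Separator "." with four key fields, as would be set with
    # -D stream.map.output.field.separator=. and
    # -D stream.num.map.output.key.fields=4
    print(split_key_value("a.b.c.d.e.f", separator=".", num_key_fields=4))
```

With the separator set to "." and four key fields, the line "a.b.c.d.e.f" splits into the key "a.b.c.d" and the value "e.f".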

      1. patch-1851.txt
        6 kB
        Amareshwari Sriramadasu
      2. patch-1851-1.txt
        6 kB
        Amareshwari Sriramadasu
      3. patch-1851-2.txt
        6 kB
        Amareshwari Sriramadasu

        Activity

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #523 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/523/)

        Amareshwari Sriramadasu added a comment -

        Thanks for the review Ravi.
        I just committed this!

        Ravi Gummadi added a comment -

        Latest patch looks good.
        +1

        Amareshwari Sriramadasu added a comment -

        Modified the description for the property. Ran ant docs with the patch successfully.

        Ravi Gummadi added a comment -

        The description of stream.joindelay.milli, "Timeout in milliseconds for error thread and output thread to die", is misleading: it reads as the number of milliseconds for which the error thread and output thread stay alive (the time from thread start to thread death). Maybe we should say that this is the amount of time we wait when joining the error and output threads at the end of the mapper/reducer?
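The distinction drawn above (a bound on how long we wait while joining, not a thread lifetime) can be sketched with Python's `threading` module. This is only an analogy, not Hadoop's implementation; `join_with_delay` and its parameter names are hypothetical.

```python
import threading


def join_with_delay(threads, join_delay_millis):
    """Wait up to join_delay_millis for each thread to finish.

    The delay bounds how long the caller blocks in join(); it says
    nothing about how long each thread has been running. Returns the
    names of threads that are still alive after the wait.
    """
    timeout_seconds = join_delay_millis / 1000.0
    for thread in threads:
        thread.join(timeout_seconds)
    return [thread.name for thread in threads if thread.is_alive()]


if __name__ == "__main__":
    release = threading.Event()
    # Stand-ins for the error and output threads of a streaming task.
    err = threading.Thread(target=lambda: None, name="error-thread")
    out = threading.Thread(target=release.wait, name="output-thread", daemon=True)
    err.start()
    out.start()
    print(join_with_delay([err, out], join_delay_millis=50))
    release.set()
```

A thread that is still running when the wait expires is simply reported as alive; the join delay does not terminate it.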

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12447214/patch-1851-1.txt
        against trunk revision 955198.

        +1 @author. The patch does not contain any @author tags.

        +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/248/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/248/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/248/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/248/console

        This message is automatically generated.

        Amareshwari Sriramadasu added a comment -

        Ran ant docs with patch successfully.

        Amareshwari Sriramadasu added a comment -

        The patch incorporates Ravi's comments. It also removes stream..input.writer.class and stream..output.reader.class from the table, since they are also internal and not set by the user.

        Ravi Gummadi added a comment -

        We could also specify, for the 4 properties stream.map.input, stream.map.output, stream.reduce.input and stream.reduce.output, that they take the values given with -D only if -io <identifier> is not used. In other words, should we say that "-io <identifier>" will replace these 4 properties with the <identifier>?
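The precedence proposed in this comment can be sketched as follows. This is a minimal illustration under stated assumptions: the function `effective_io_properties` is hypothetical, the fourth property is taken to be stream.reduce.output, and the identifier values "text" and "typedbytes" are used only as examples.

```python
IO_PROPERTIES = (
    "stream.map.input",
    "stream.map.output",
    "stream.reduce.input",
    "stream.reduce.output",  # assumed name of the fourth property
)


def effective_io_properties(d_options, io_identifier=None):
    """Return the I/O properties after applying -io, if it was given.

    d_options holds values passed with -D. When io_identifier is set,
    it replaces all four properties, so the -D values are ignored.
    """
    resolved = {name: d_options[name] for name in IO_PROPERTIES if name in d_options}
    if io_identifier is not None:
        for name in IO_PROPERTIES:
            resolved[name] = io_identifier
    return resolved


if __name__ == "__main__":
    from_d = {"stream.map.input": "text"}
    print(effective_io_properties(from_d))
    print(effective_io_properties(from_d, io_identifier="typedbytes"))
```

Without -io, only the -D value survives; with -io, all four properties carry the identifier.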

        Ravi Gummadi added a comment -

        It seems stream.jobLog_ is not documented anywhere and does not appear to be useful. We can remove it altogether, maybe in a separate JIRA, so let us not document it here?

        stream.addenvironment seems to be an internal property that is not intended for Hadoop streaming users. Let us not document it.

        We can add the config property stream.stderr.reporter.prefix with the default value "reporter:". Would this need changes to the questions/answers related to "update status" and "update counter" in the FAQ?
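For context, the FAQ entries on "update status" and "update counter" describe lines that a streaming script writes to stderr, starting with the reporter prefix. A minimal Python sketch of building such lines follows; the helper names are hypothetical, and the line formats (prefix followed by counter:group,counter,amount or status:message) are an assumption based on the streaming FAQ.

```python
import sys

DEFAULT_PREFIX = "reporter:"  # default of stream.stderr.reporter.prefix per the comment above


def counter_line(group, counter, amount, prefix=DEFAULT_PREFIX):
    """Build a stderr line asking the framework to increment a counter."""
    return f"{prefix}counter:{group},{counter},{amount}"


def status_line(message, prefix=DEFAULT_PREFIX):
    """Build a stderr line asking the framework to update the task status."""
    return f"{prefix}status:{message}"


def report(line, stream=sys.stderr):
    """Write one reporter line and flush so it is seen promptly."""
    stream.write(line + "\n")
    stream.flush()


if __name__ == "__main__":
    report(counter_line("MyGroup", "RecordsSeen", 1))
    report(status_line("processing input"))
```

If the prefix property is changed for a job, the script would need to pass the same value as `prefix`, which is why the FAQ wording depends on the default.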

        Amareshwari Sriramadasu added a comment -

        Patch is just a documentation change. Test failures are definitely not related to the patch.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12446858/patch-1851.txt
        against trunk revision 953660.

        +1 @author. The patch does not contain any @author tags.

        +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/235/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/235/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/235/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/235/console

        This message is automatically generated.

        Amareshwari Sriramadasu added a comment -

        The patch documents all the streaming configuration parameters in a table in streaming.xml.


          People

          • Assignee:
            Amareshwari Sriramadasu
            Reporter:
            Amareshwari Sriramadasu
          • Votes:
            0
            Watchers:
            1
