Matei Zaharia added a comment - 10/Jul/08 12:01 AM
Here's a patch that lets you use bin/hadoop -C property=value [command].
Good idea. Since the -D key=value syntax is managed by the Tool/ToolRunner, er, toolchain (see
Regarding making streaming, pipes, etc use ToolRunner - I think that could be more complicated than it seems because you'd need to change the existing argument parsing in those libraries. People who have modified their streaming or pipes implementations would also have trouble (for example, we have a modified streaming at Facebook). Any new tool implementers can choose to use ToolRunner if they want, but this method lets you just write a simple Java class that calls submitJob and still be able to send parameters from bin/hadoop.
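To make the -D key=value convention mentioned above concrete, here is a minimal sketch in plain Java of how such options might be separated from the rest of a command line. It is not Hadoop's GenericOptionsParser and not the code in the patch; the class name GenericOptionSketch and its fields are hypothetical, and it assumes both the "-D key=value" and "-Dkey=value" spellings should be accepted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not Hadoop's GenericOptionsParser. */
class GenericOptionSketch {
    /** Properties collected from -D options, in the order they were given. */
    final Map<String, String> properties = new LinkedHashMap<>();
    /** Arguments that were not -D options, left for the tool itself. */
    final List<String> remaining = new ArrayList<>();

    GenericOptionSketch(String[] args) {
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if (arg.equals("-D")) {
                // Spaced form: the key=value pair is the next argument.
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("-D requires key=value");
                }
                addProperty(args[++i]);
            } else if (arg.startsWith("-D")) {
                // Attached form: the pair follows -D directly.
                // Assumption: no other option of the tool starts with "-D".
                addProperty(arg.substring(2));
            } else {
                remaining.add(arg);
            }
        }
    }

    /** Splits "key=value" at the first '=' and stores the pair. */
    private void addProperty(String pair) {
        int eq = pair.indexOf('=');
        if (eq <= 0) {
            throw new IllegalArgumentException("Expected key=value, got: " + pair);
        }
        properties.put(pair.substring(0, eq), pair.substring(eq + 1));
    }

    public static void main(String[] args) {
        GenericOptionSketch parsed = new GenericOptionSketch(args);
        System.out.println("properties: " + parsed.properties);
        System.out.println("remaining:  " + parsed.remaining);
    }
}
```

A job class written this way would copy the collected properties into its configuration before submitting and treat the remaining arguments as its own. A class that goes through ToolRunner gets equivalent handling without writing any of this.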
I'm with Chris on this one, I don't think we need yet another way to pass config options along with -Dkey=value and -jobconf. Rather we need to standardize. So, it does make sense to pick one (-D or -jobconf) and stick with it. Yes, it means we will need to fix streaming/pipes or ToolRunner - we should.
Ideally we should :
+1 for Enis's solution.
That said, a solution like the one in Matei's patch might be a tolerable, short-term bridge between 0.17 and 0.18 for user code affected by
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12385687/HADOOP-3722.patch against trunk revision 676069.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2837/testReport/
This message is automatically generated.
This patch
I would really appreciate it if someone with real streaming / pipes usage could test this out.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12389176/jobconfoptions_v1.patch against trunk revision 690641.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 6 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 1 new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3149/testReport/
This message is automatically generated.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12389435/jobconfoptions_v2.patch against trunk revision 692409.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 6 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3189/testReport/
This message is automatically generated.
Failing test is not related to this patch.
+1, this is looking great!
I'll get some 'expert' Streaming users to take a brief look and then go ahead and commit this. OTOH, I've changed my mind - I believe it's fine to commit this as-is and deal with the consequences later since this is an important cleanup.
I just committed this. Thanks, Enis!
This appears to be an incompatible change. I am wondering whether the older methods of submitting job parameters were deprecated (but still work with 0.19) or have been removed completely?
The patch only deprecates parameters, issuing a warning, and introduces new ones. However, in streaming there were some parameters, like -cluster, that were not working, so I just removed them.
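As an illustration of the "deprecate with a warning" approach described above, here is a minimal sketch in plain Java that rewrites a deprecated flag to its replacement and records a warning. It is not the code from the patch: the class name DeprecatedOptionSketch, the translate method, and the warning text are hypothetical, and it covers only -jobconf (rewritten to -D), since that is the one pairing the discussion spells out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the code from the patch. */
class DeprecatedOptionSketch {
    /**
     * Returns a copy of args in which every "-jobconf" flag is replaced by "-D".
     * One warning message is appended to warnings for each replaced flag.
     * Simplification: an option value that is literally "-jobconf" is also rewritten.
     */
    static List<String> translate(String[] args, List<String> warnings) {
        List<String> out = new ArrayList<>();
        for (String arg : args) {
            if (arg.equals("-jobconf")) {
                warnings.add("-jobconf option is deprecated, please use -D instead.");
                out.add("-D");
            } else {
                out.add(arg);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> warnings = new ArrayList<>();
        List<String> translated = translate(args, warnings);
        // Warnings go to stderr so they do not mix with the translated command line.
        for (String w : warnings) {
            System.err.println("WARN: " + w);
        }
        System.out.println(String.join(" ", translated));
    }
}
```

Old command lines keep working under this scheme, while the warning tells users which spelling to move to before the deprecated one is removed.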
Enis, could you please add a detailed 'Release Note' for this jira? Thanks!
Integrated in Hadoop-trunk #611 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/611/)
This issue:
1. Changed StreamJob (of streaming) and Submitter (of pipes) to implement Tool and Configurable. Streaming and Submitter now accept GenericOptionsParser arguments: -fs, -jt, -conf, -D, -libjars, -files, -archives
2. Deprecated -jobconf, -cacheArchive, -dfs, -cacheArchive, -additionalconfspec from streaming and pipes (where applicable) in favor of the generic options. The options still work, issuing a warning as a side effect; however, they may be removed in later releases.
3. Removed from streaming :