Hadoop Map/Reduce
MAPREDUCE-3131

Docs and Scripts for setting up single node MRV2 cluster.


      Description

      Scripts to run a single node cluster with a default configuration. Takes care of running all the daemons including hdfs and yarn.

      1. MAPREDUCE-3131.patch
        55 kB
        Prashant Sharma
      2. MAPREDUCE-3131.patch
        58 kB
        Prashant Sharma
      3. MAPREDUCE-3131.patch
        34 kB
        Prashant Sharma
      4. MAPREDUCE-3131.patch
        14 kB
        Prashant Sharma

        Activity

        Prashant Sharma created issue -
        Prashant Sharma made changes -
        Field Original Value New Value
        Fix Version/s 0.23.0 [ 12315570 ]
        Prashant Sharma made changes -
        Component/s documentation [ 12312910 ]
        Component/s mrv2 [ 12314301 ]
        Prashant Sharma added a comment -

        I am thinking of a good, clean (and easy) way of starting an MRV2 cluster using scripts. I will soon submit code and documentation for review. Until then, take a look at http://github.com/prashantiiith/Scripts/wiki/Setup-required-to-play-with-Hadoop-MRV2

        thanks
        prashant

        Subroto Sanyal added a comment -

        Hi Prashant,
        The existing documentation:
        1) ~hadoop-trunk/BUILDING.txt
        2) ~hadoop-trunk/hadoop-mapreduce-project/INSTALL
        clearly tells a user:
        a) how to build the project
        b) how to get the distributable
        c) how to run the YARN cluster

        IMO the existing documentation and scripts are enough.

        Prashant Sharma added a comment -

        Hi Subroto,

        Thanks for reading the doc. My effort is not to make those redundant but to provide an easier and more generic way of doing it with scripts. I felt the current documentation for getting HDFS running as a daemon and then configuring mapred is spread across different documents, so initially it was difficult for me to get things running, and I was looking for something close to what I have done. (Please review it again; I have updated it.) That became the motivation for giving it back to the community. I would be grateful for any suggestions and will try to improve it as the framework evolves.

        Thanks again
        Prashant

        Prashant Sharma made changes -
        Labels documentation hadoop
        Fix Version/s 0.24.0 [ 12317654 ]
        Fix Version/s 0.23.0 [ 12315570 ]
        Affects Version/s 0.24.0 [ 12317654 ]
        Target Version/s 0.24.0 [ 12317654 ]
        Description Scripts to run a single node cluster with a default configuration. Takes care of running all the daemons including hdfs and yarn.
        Prashant Sharma made changes -
        Priority Major [ 3 ] Trivial [ 5 ]
        Prashant Sharma made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Prashant Sharma made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Prashant Sharma made changes -
        Attachment MAPREDUCE-3131.patch [ 12497425 ]
        Prashant Sharma made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12497425/MAPREDUCE-3131.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/921//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/921//console

        This message is automatically generated.

        Arun C Murthy added a comment -

        Prashant, thanks for taking this up.

        No doubt, this is very useful for folks new to the project.

        Thinking a bit more, maybe we can add a bin/single-node directory where we check in these scripts. To make them even more useful, you could provide a way to run at least 2 (or more) DataNodes and 2 (or more) NodeManagers.

        You could also provide a documentation patch to hadoop-yarn-site/src/site/apt/SingleCluster.apt.vm (run mvn site:site to generate the updated docs).

        Thoughts?

        Arun C Murthy made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Prashant Sharma added a comment -

        Hi Arun,

        I am fairly new to Hadoop, so I have two questions.

        1. Why do we need more than one DataNode and NodeManager in a single-node setup?
        2. Is there standard documentation for doing it? I was getting bind exceptions while trying to start two DataNodes. Is it configurable somewhere, or can I figure it out myself?

        I have already improved the scripts with some enhancements, and I was thinking that having debug as an option would make them more useful. Again, it would be helpful if you could refer me to a standard doc for that.

        Thanks

        Subroto Sanyal added a comment -

        To run two DataNodes on the same system, update the following properties in hdfs-site.xml:

        <property>
          <name>dfs.datanode.address</name>
          <value>xxx.xxx.xxx.xxx:aaaaaa</value>
          <description>
            The address where the datanode server will listen to.
            If the port is 0 then the server will start on a free port.
          </description>
        </property>
        
        <property>
          <name>dfs.datanode.ipc.address</name>
          <value>xxx.xxx.xxx.xxx:zzzzzz</value>
          <description>
            The datanode ipc server address and port.
            If the port is 0 then the server will start on a free port.
          </description>
        </property>
        
        <property>
          <name>dfs.datanode.http.address</name>
          <value>xxx.xxx.xxx.xxx:yyyyy</value>
          <description>
            The datanode http server address and port.
            If the port is 0 then the server will start on a free port.
          </description>
        </property>
        
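        Each extra DataNode needs its own config directory with the three addresses above pointed at free ports. A minimal sketch of generating such an override file using ephemeral ports (port 0); the directory layout and the choice of 0.0.0.0:0 are illustrative assumptions, not from the patch:

```shell
#!/bin/sh
# Sketch: generate a per-DataNode hdfs-site.xml override that binds the
# DataNode's data, IPC, and HTTP servers to ephemeral ports (port 0),
# so several DataNodes can coexist on one host. Paths are illustrative.
gen_dn_conf() {
  dir=$1                        # target config dir, e.g. conf.dn2
  mkdir -p "$dir"
  cat > "$dir/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property><name>dfs.datanode.address</name><value>0.0.0.0:0</value></property>
  <property><name>dfs.datanode.ipc.address</name><value>0.0.0.0:0</value></property>
  <property><name>dfs.datanode.http.address</name><value>0.0.0.0:0</value></property>
</configuration>
EOF
}

gen_dn_conf /tmp/conf.dn2
```

        The second DataNode would then be started against this config directory, with its own data, pid, and log directories to avoid clashes.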
        Prashant Sharma added a comment -

        Well, the above config worked for running multiple DataNodes. But for YARN:

        <property>
          <name>yarn.nodemanager.address</name>
          <value>0.0.0.0:0</value>
          <description>the nodemanagers bind to this port</description>
        </property>

        Setting the port to 0 here has no effect; it always tries to bind to 4344.

        I am sorry to say this without properly digging into the code, but I was getting bind exceptions while starting more than one YARN NM.

        So is it that this feature hasn't been built into YARN yet? In that case I don't mind doing it myself.

        Thanks.

        Prashant Sharma made changes -
        Attachment MAPREDUCE-3131.patch [ 12497978 ]
        Prashant Sharma made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Prashant Sharma added a comment -

        Please ignore the above comment. There was a problem with using ephemeral ports for the property "mapreduce.shuffle.port", which was fixed after an svn update.

        Now the task is complete with all features.

        Prashant Sharma made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Prashant Sharma added a comment -

        Task completed!

        Prashant Sharma made changes -
        Attachment MAPREDUCE-3131.patch [ 12497982 ]
        Prashant Sharma made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12497982/MAPREDUCE-3131.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/955//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/955//console

        This message is automatically generated.

        Prashant Sharma made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Prashant Sharma made changes -
        Attachment MAPREDUCE-3131.patch [ 12498032 ]
        Prashant Sharma made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12498032/MAPREDUCE-3131.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/956//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/956//console

        This message is automatically generated.

        Mahadev konar made changes -
        Assignee Prashant Sharma [ prashant_ ]
        Prashant Sharma logged work - 07/Oct/11 20:26
        • Time Spent:
          48h
           
          <No comment>
        Prashant Sharma made changes -
        Remaining Estimate 168h [ 604800 ] 120h [ 432000 ]
        Time Spent 48h [ 172800 ]
        Worklog Id 12135 [ 12135 ]
        Prashant Sharma made changes -
        Attachment MAPREDUCE-3131.patch [ 12497982 ]
        Prashant Sharma made changes -
        Attachment MAPREDUCE-3131.patch [ 12497978 ]
        Prashant Sharma added a comment -

        A new feature: options for debugging daemons.

        Prashant Sharma made changes -
        Status Patch Available [ 10002 ] In Progress [ 3 ]
        Arun C Murthy added a comment -

        Why do we need more than one datanode and nodeManager in a single node setup.?

        Prashant - we already have docs/scripts for running all of Hadoop on a single node - they are the standard bin/hdfs & bin/yarn scripts.


        Now, one possible enhancement is to go through the default values of configs and check which ones cause it to not work properly on a single node out of the box.

        If there is none, we declare victory.

        If there are some we cannot make default (e.g. because they break tests), we can then check in a conf/singlenode-hdfs-site.xml and conf/singlenode-yarn-site.xml.


        Another enhancement is to check in configs & scripts with which people can run multiple DataNodes/NodeManagers on a single node so that folks get a feel for a 'real' cluster on a single machine to test/develop/experience.


        I'm happy to have you do either or both of the above.

        However, checking in another set of scripts to run a single-node install with 1 NN/DN/RM/NM is just maintenance overhead.

        Makes sense?

        Prashant Sharma added a comment -

        Arun,

        You are right; adding another set of scripts should have value beyond just starting a cluster without much hassle.

        I made these scripts because they made sense to me and gave me a feel for all the ways I could break the cluster. And I had all sorts of (possibly weird) ideas, like trying out multiple "single node" clusters on the same machine and trying out a variety of things (like distcp).

        So, in order to bring in value with a new feature, I have already implemented the following (I have not submitted the patch yet):

        *To set up the cluster home in one step:

         ./run.sh prepare

        *Start/stop and choose the number of NodeManagers or DataNodes:

         ./run.sh -NM <No of Nodemanager daemons> -D <No of datanode daemons> [start|afresh|stop|kill]

        *Debugging, and switching between or choosing daemons for debugging, would be a breeze:

         ./run.sh -NM <port no> -RM <port no> -DN <port no> -NN <portno> -JH <port no> debug

        I will soon submit the patch with much better code quality. I understand the hassle of maintaining badly written code (especially scripts). I have also already written some default configs that are ideal for a single-node cluster. I liked the naming conventions.

        Do you see some value now? I would like to have you all throwing in ideas.

        Thanks

        Prashant Sharma added a comment -

        Edit in step 2:

        ./run.sh -NM <No of Nodemanager daemons> -D <No of datanode daemons> [start|afresh|stop|kill|status]

        Added a status option (currently work in progress). Status will check the pid files of the processes and query the system about their status. It would be useful in case you run multiple clusters.

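        A pid-file status check of the kind described could look like the following sketch; the `<name>.pid` layout and the pid directory are assumptions, not taken from the patch:

```shell
#!/bin/sh
# Sketch: report whether each daemon whose pid file lives under a pid
# directory is still running, in the spirit of the `status` option above.
# The <name>.pid naming is an assumption, not from the patch.
status() {
  PID_DIR=${1:-/tmp/hadoop-pids}
  for f in "$PID_DIR"/*.pid; do
    [ -e "$f" ] || { echo "no pid files in $PID_DIR"; return; }
    name=$(basename "$f" .pid)
    pid=$(cat "$f")
    # signal 0 probes for existence without actually signalling
    if kill -0 "$pid" 2>/dev/null; then
      echo "$name (pid $pid): running"
    else
      echo "$name (pid $pid): not running"
    fi
  done
}
```

        With one pid file per daemon per cluster instance, the same check works across several clusters on one machine.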
        Arun C Murthy made changes -
        Parent MAPREDUCE-2890 [ 12520465 ]
        Issue Type Sub-task [ 7 ] Improvement [ 4 ]
        Prashant Sharma added a comment -

        DEPENDS on https://issues.apache.org/jira/browse/MAPREDUCE-3211
        Prashant Sharma added a comment -

        New and improved. This currently depends on MAPREDUCE-2986 being fixed. Until then, you can remove all the ephemeral port configurations and run only single daemons of each type, or run multiple daemons but not submit jobs.

        Prashant Sharma made changes -
        Attachment MAPREDUCE-3131.patch [ 12499821 ]
        Prashant Sharma logged work - 20/Oct/11 09:04
        • Time Spent:
          48h
           
          <No comment>
        Prashant Sharma made changes -
        Remaining Estimate 120h [ 432000 ] 72h [ 259200 ]
        Time Spent 48h [ 172800 ] 96h [ 345600 ]
        Worklog Id 12234 [ 12234 ]
        Prashant Sharma made changes -
        Status In Progress [ 3 ] Open [ 1 ]
        Prashant Sharma made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12499821/MAPREDUCE-3131.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 160 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1080//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1080//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1080//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1080//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-app.html
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1080//console

        This message is automatically generated.

        Prashant Sharma added a comment -

        For convenience, read the wiki here: http://github.com/ScrapCodes/Scripts/wiki/Setup-required-to-play-with-Hadoop-MRV2
        Prashant Sharma added a comment -

        New and improved: better documentation for setting up a single-node cluster using the scripts.

        Waiting for reviews.

        Prashant Sharma made changes -
        Attachment MAPREDUCE-3131.patch [ 12501949 ]
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12501949/MAPREDUCE-3131.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1239//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1239//console

        This message is automatically generated.

        Hide
        Prashant Sharma added a comment -

        Hadoop MapReduce has many users exploiting its usefulness and simplicity. Some of them are students (or other non-industrial users) who need the power of MapReduce for their research problems, or more commonly for even more trivial tasks (like generating indexes or just running WordCount).

        So I was thinking a bit further: if I can make it (very) easy to set up a trivial cluster (unsecured, with default parameters, created by sharing desktop resources - a typical scenario among students), those users can bypass figuring out the full setup.

        There are two approaches to this:
        1. Make a tar and copy it around (or rsync the cluster home).
        2. Use the deb/rpm approach.

        A deb/rpm setup would be more ideal, but it is nontrivial to build and makes the steps more cumbersome, whereas doing without package management simplifies a few things. Since the users of these trivial clusters are less particular about the "ideal" approach, I propose the tar approach.
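        The tar-and-copy approach above might look like the following sketch. This is only an illustration: the directory names, the placeholder distribution contents, and the local "node" directory standing in for a remote desktop are all hypothetical, and the scp/ssh copy step is omitted so the sketch runs locally.

```shell
# Hypothetical sketch of option 1: package the build once, then unpack it
# on each node. The "remote node" is simulated with a local directory; in
# practice the tarball would be copied out with scp or rsync first.
set -e

WORK=$(mktemp -d)
DIST="$WORK/hadoop-dist"       # stand-in for the built distribution
NODE="$WORK/node01"            # stand-in for a remote desktop's filesystem
mkdir -p "$DIST/bin" "$NODE"

# A placeholder file standing in for the real distribution contents.
echo '#!/bin/sh' > "$DIST/bin/yarn"

# Create the tarball once on the build machine...
tar czf "$WORK/hadoop-mrv2.tar.gz" -C "$WORK" hadoop-dist

# ...then unpack it on each node (copy step omitted in this local sketch).
tar xzf "$WORK/hadoop-mrv2.tar.gz" -C "$NODE"

ls "$NODE/hadoop-dist/bin"     # prints: yarn
```

        The same round trip would repeat per node, which is why the comment above notes that rsync of the cluster home is an equivalent alternative.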

        Arun C Murthy added a comment -

        Prashant, this is great progress. Sorry for being late on this one.

        Thinking a bit more...

        1. It would be really nice to not require a new 'run.sh' script. Can we just get bin/hadoop-daemon.sh and bin/yarn-daemon.sh to start/stop multiple DNs/NMs on the same node? Then we can just document this in our site-docs.
        2. Can we add the -debug functionality to all daemons?

        Basically, I'm trying to avoid maintaining another set of scripts (run.sh) and put everything in the main scripts/configs...
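        One way the suggestion in point 1 could work is sketched below: give each daemon instance its own conf dir and pid dir, then reuse the stock yarn-daemon.sh unchanged. The port numbers, directory layout, and the choice of yarn.nodemanager.localizer.address as the per-instance port are illustrative assumptions, not a tested configuration; the daemon invocation itself is left commented out since it requires a built distribution.

```shell
# Hypothetical sketch: run multiple NodeManagers on one node by generating a
# per-instance conf dir (with a distinct port) and pid dir for each, so the
# unmodified yarn-daemon.sh can be pointed at each instance in turn.
set -e

BASE=$(mktemp -d)

for i in 1 2; do
  conf="$BASE/nm$i/conf"
  mkdir -p "$conf" "$BASE/nm$i/pids"
  port=$((8040 + i))           # give each NM instance a distinct port
  cat > "$conf/yarn-site.xml" <<EOF
<configuration>
  <property>
    <name>yarn.nodemanager.localizer.address</name>
    <value>0.0.0.0:$port</value>
  </property>
</configuration>
EOF
  # With per-instance env set, the stock script could be reused as-is:
  #   YARN_CONF_DIR="$conf" YARN_PID_DIR="$BASE/nm$i/pids" \
  #     bin/yarn-daemon.sh start nodemanager
done

grep -h localizer "$BASE"/nm*/conf/yarn-site.xml
```

        Any other per-instance addresses (web UI, RPC) would need the same treatment, but the point stands that no new run.sh is required, only per-instance env vars and conf dirs.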

        Arun C Murthy made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Prashant Sharma made changes -
        Assignee Prashant Sharma [ prashant_ ]

          People

          • Assignee: Unassigned
          • Reporter: Prashant Sharma
          • Votes: 0
          • Watchers: 8

          Dates

          • Created:
          • Updated:

          Time Tracking

          • Original Estimate: 168h
          • Time Spent: 96h
          • Remaining Estimate: 72h

                Development