Details

    • Type: Task
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.0
    • Component/s: documentation
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      (Updated Summary and Description)

      The cluster_setup.xml file is in 2 places: common-trunk and mapreduce-trunk.

      The single_node_setup.xml file is in 1 place: common-trunk.

      Issues:

      (1) Remove duplication - cluster_setup.xml should only be in 1 trunk (no duplication of files)

      (2) Both files stay together - cluster_setup.xml and single_node_setup.xml should be together in the same location (trunk)

      (3) Which trunk - originally, both files were assigned to the common-trunk during the doc split that occurred in the summer of 2009.

      Solutions:

      (1) have both files live in common-trunk ... OR ...

      (2) have both files live in mapreduce-trunk

      This ticket affects trunk and branch-0.21
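
      The duplication described in issue (1) can be checked mechanically. The following is only a minimal sketch, assuming each trunk is checked out as a local directory; the function name `duplicated_docs` and the idea of comparing by file name are illustrative assumptions, not part of this ticket or its patches.

```python
import os


def duplicated_docs(trunk_a, trunk_b, suffix=".xml"):
    """Return the sorted file names ending in `suffix` found under both trees.

    `trunk_a` and `trunk_b` are hypothetical local checkout paths
    (for example, a common-trunk and a mapreduce-trunk working copy).
    """
    def names(root):
        # Collect bare file names, ignoring which subdirectory they live in.
        found = set()
        for _dirpath, _dirnames, filenames in os.walk(root):
            found.update(f for f in filenames if f.endswith(suffix))
        return found

    # A name present in both sets is a candidate duplicate.
    return sorted(names(trunk_a) & names(trunk_b))
```

      Matching on file name alone is a deliberate simplification: it flags candidates such as cluster_setup.xml for manual review and does not compare file contents.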

      1. MAPREDUCE-1404.patch
        96 kB
        Tom White
      2. MAPREDUCE-1404.patch
        181 kB
        Tom White

          Activity

          Hemanth Yamijala added a comment -

          Is this the right direction? cluster_setup.xml contains elaborate steps to properly configure an HDFS / mapreduce cluster. Shouldn't the mapreduce parts be available under the mapreduce project, in the spirit of the project split? Indeed, given that 'common' is just a set of utility libraries, the real crux of cluster_setup will be in the HDFS and mapreduce subprojects - like how to set up an HDFS cluster and how to set up a mapreduce cluster.

          From a developer / committer perspective, any patch of HDFS or mapreduce that introduces a change that we think influences cluster setup of the respective component will then generate two JIRAs - one for the code change and another for the documentation. The synchronization of these patches will become an overhead for committers, IMHO.

          If we do decide to go with just one version under common, we should make sure to merge changes done to the mapreduce parts of cluster_setup.xml with common's version. I remember having reviewed changes to cluster_setup.xml (involving mapreduce parts) and committing them here. But I'd be more interested to know if we can come up with a different organization.

          Vinod Kumar Vavilapalli added a comment -

          I am for having them separately as two files in mapreduce and hdfs projects describing the set up of corresponding clusters and possibly linked together in common project. Thoughts?

          Resolving MAPREDUCE-1039 as duplicate of this issue.

          Corinne Chandel added a comment -

          This ticket is addressing the duplication of the files in the TRUNKs.

          Given Hemanth's comment, I propose the following:

          > cluster_setup.xml – keep in the M/R TRUNK and delete from the Common TRUNK
          > single_node_setup.xml – add to the M/R TRUNK and delete from the Common TRUNK

          Later, if you want to split up the cluster_setup guide (MR v. HDFS) you can:
          > edit the cluster_setup.xml file in the M/R TRUNK (remove the HDFS stuff)
          > create a new hdfs_cluster_setup.xml file in the HDFS TRUNK

          Owen O'Malley added a comment -

          I think we should keep the cluster and single node setup documentation in Common. I think that it is best left there so that we can minimize duplication of effort. Otherwise there will be a large duplication between the HDFS and MapReduce versions of the documentation; having a single document telling people how to set up clusters is better.

          Corinne Chandel added a comment -

          More comments:

          (1) The Common Overview page (index.html) currently reads as if Single Node Setup and Cluster Setup are both under Common. If we go with Owen's comment (keeping Single Node Setup and Cluster Setup under Common), then no change needs to be made to the Overview page.

          (2) If we go with Owen's comment (keeping Single Node Setup and Cluster Setup under Common), then the M/R cluster_setup.xml file may be more current than the Common cluster_setup.xml file. Thus, the M/R file needs to replace the Common file (and then be deleted from M/R).

          Tom White added a comment -

          Here's one of two patches (the other is in HADOOP-6738), which moves the cluster_setup.xml file to common (since, as Corinne observed, the MR one is more up to date). This patch also moves commands_manual.xml from mapreduce to common, since it is closely related and it makes sense to do both at the same time.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12445958/MAPREDUCE-1404.patch
          against trunk revision 949815.

          +1 @author. The patch does not contain any @author tags.

          +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/212/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/212/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/212/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/212/console

          This message is automatically generated.

          Hemanth Yamijala added a comment -

          This patch looks fine. I applied it, compiled docs and made sure that the links were all working fine. Some links have changed between trunk/0.21 and 0.20. For example, single-node-cluster was called quickstart in 0.20. Because there is no documentation published for anything later than 20, these give 404s right now. But I am assuming this will get corrected once a release's documentation is published. Is this correct?

          I also quickly checked if any other documentation needs to be moved. HOD sources are in common, but documentation is still in 0.21. Should we move the documentation to common?

          Other changes are fine.

          Tom White added a comment -

          Thanks for the review, Hemanth. I've moved the HOD documentation page to common (see HADOOP-6738) per your suggestion. The 404s will go away once the documentation for 21 is published to the site.

          I think this change is ready now. I'll commit it soon unless there are objections.

          Hemanth Yamijala added a comment -

          Tom, the new changes look good and can be committed.

          I am planning to look at HADOOP-6738 as well, and will try to do so ASAP. But please don't hold back for this in case these commits are blocking you, since these are low risk changes anyway.

          Tom White added a comment -

          I've just committed this.


            People

            • Assignee:
              Tom White
              Reporter:
              Corinne Chandel
            • Votes:
              0
              Watchers:
              1
