Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.203.0
    • Fix Version/s: 0.20.204.0, 0.23.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      Added RPM/DEB packages to the build system.

      Description

      We should be able to create RPMs for Hadoop releases.

      Attachments

      1. HADOOP-6255-common-trunk-12.patch
        129 kB
        Eric Yang
      2. HADOOP-6255-branch-0.20-security-13.patch
        107 kB
        Eric Yang
      3. HADOOP-6255-branch-0.20-security-12.patch
        107 kB
        Eric Yang
      4. HADOOP-6255-common-trunk-11.patch
        129 kB
        Eric Yang
      5. HADOOP-6255-common-trunk-10.patch
        129 kB
        Eric Yang
      6. HADOOP-6255-common-trunk-9.patch
        129 kB
        Eric Yang
      7. HADOOP-6255-common-trunk-8.patch
        128 kB
        Eric Yang
      8. HADOOP-6255-branch-0.20-security-11.patch
        107 kB
        Eric Yang
      9. HADOOP-6255-common-trunk-7.patch
        126 kB
        Eric Yang
      10. HADOOP-6255-branch-0.20-security-10.patch
        106 kB
        Eric Yang
      11. HADOOP-6255-common-trunk-6.patch
        125 kB
        Eric Yang
      12. HADOOP-6255-common-trunk-5.patch
        125 kB
        Eric Yang
      13. HADOOP-6255-common-trunk-4.patch
        129 kB
        Eric Yang
      14. HADOOP-6255-branch-0.20-security-9.patch
        105 kB
        Eric Yang
      15. HADOOP-6255-mapred-trunk-2.patch
        121 kB
        Eric Yang
      16. HADOOP-6255-common-trunk-2.patch
        131 kB
        Eric Yang
      17. HADOOP-6255-mapred-trunk-1.patch
        120 kB
        Eric Yang
      18. HADOOP-6255-hdfs-trunk-1.patch
        73 kB
        Eric Yang
      19. HADOOP-6255-common-trunk-1.patch
        131 kB
        Eric Yang
      20. HADOOP-6255-mapred-trunk.patch
        120 kB
        Eric Yang
      21. HADOOP-6255-hdfs-trunk.patch
        73 kB
        Eric Yang
      22. HADOOP-6255-common-trunk.patch
        134 kB
        Eric Yang
      23. HADOOP-6255-branch-0.20-security-8.patch
        98 kB
        Eric Yang
      24. HADOOP-6255-branch-0.20-security-7.patch
        82 kB
        Eric Yang
      25. HADOOP-6255-branch-0.20-security-6.patch
        81 kB
        Eric Yang
      26. HADOOP-6255-branch-0.20-security-5.patch
        80 kB
        Eric Yang
      27. HADOOP-6255-branch-0.20-security-4.patch
        78 kB
        Eric Yang
      28. deployment.tex
        5 kB
        Owen O'Malley
      29. deployment.pdf
        75 kB
        Owen O'Malley
      30. HADOOP-6255-branch-0.20-security-3.patch
        78 kB
        Eric Yang
      31. HADOOP-6255-branch-0.20-security-2.patch
        78 kB
        Eric Yang
      32. HADOOP-6255-branch-0.20-security-1.patch
        78 kB
        Eric Yang
      33. HADOOP-6255-branch-0.20-security.patch
        69 kB
        Eric Yang
      34. HADOOP-6255.patch
        71 kB
        Eric Yang

          Activity

          FROHNER Ákos added a comment -

          Hi,

          I would suggest the other way around: create an RPM spec file,
          which uses a distribution tarball and calls the generic build.xml
          to build the hadoop packages.

          This approach eases adoption by upstream distributions, as they already have the framework to build packages from tarball+spec files (source RPM).

          And the same pattern can be used for Debian/Ubuntu packaging.
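
          (For illustration, a minimal sketch of the tarball+spec workflow suggested above; the file names and rpmbuild options are assumptions, not part of any attached patch:)

            # Build source and binary RPMs straight from a release tarball that carries
            # its own spec file (the spec would invoke the generic build.xml in %build):
            rpmbuild -ta hadoop-0.20.203.0.tar.gz

            # Or build from a standalone spec, pointing rpmbuild at the tarball's directory:
            rpmbuild -ba hadoop.spec --define "_sourcedir $(pwd)"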

          steve_l added a comment -
          1. This should be a separate project from the others; it's integration, and will soon get big.
          2. The project would be Linux and OS X (with rpmbuild installed) only. Even on Linux, the right tools need to be installed.
          3. Basic RPMs are easy; passing rpmlint is harder.
          4. Testing, that's the fun part.

          We test our RPMs by

          1. SCP to configured real/virtual machines. These are Centos 5.x VMs, usually hosted under VMWare. Under VirtualBox, RHEL5 and Centos spins one CPU at 100% (Virtualbox bug #1233)
          2. force-uninstalling any old versions, install the new ones.
          3. SSH in, walk the shell scripts through their entry points
              <target name="rpm-remote-initd"
                  depends="rpm-ready-to-remote-install,rpm-remote-install"
                  description="check that initd parses">
                <rootssh command="${remote-smartfrogd} start"/>
                <pause/>
                <rootssh command="${remote-smartfrogd} status"/>
                <pause/>
                <rootssh command="${remote-smartfrogd} start"/>
                <rootssh command="${remote-smartfrogd} status"/>
                <rootssh command="${remote-smartfrogd} stop"/>
                <rootssh command="${remote-smartfrogd} stop"/>
                <rootssh command="${remote-smartfrogd} restart"/>
                <pause/>
                <rootssh command="${remote-smartfrogd} status"/>
                <rootssh command="${remote-smartfrogd} restart"/>
                <pause/>
                <rootssh command="${remote-smartfrogd} status"/>
                <rootssh command="${remote-smartfrogd} stop"/>
              </target>
            
          4. run rpm -qf against various files, verify that they are owned. The RPM commands, executed remotely over SSH, are no fun to use in tests as you have to look for certain strings in the response; error codes are not used to signal failures. Ouch.
                <fail>
                  <condition>
                    <or>
                      <contains string="${rpm.queries.results}"
                          substring="is not owned by any package"/>
                      <contains string="${rpm.queries.results}"
                          substring="No such file or directory"/>
                    </or>
                  </condition>
                  One of the directories/files in the RPM is not declared as being owned by any RPM.
                  This file/directory will not be managed correctly, or have the correct permissions
                  on a hardened linux.
                  ${rpm.queries.results}
                </fail>
            

          For full functional testing, we also package up the test source trees as JAR files which are published via Ivy, so that the release/ project can retrieve those test files and point them (by way of java properties) at the remote machine. This is powerful as you can be sure that the RPM installations really do work as intended. If you only test the local machine, you miss out on problems.

          These tests don't verify all possible upgrades. Upgrades can be trouble, as RPM installs the new files before uninstalling the old ones.

          The other issue is configuration. You can either mark all configuration files as %config(noreplace), meaning people can edit them and upgrades won't stamp on them, or have a more structured process for managing conf files. Cloudera provide a web site to create a new configuration RPM; Apache could provide a .tar.gz file which contains everything needed to create your own configuration RPM.

          Therefore +1 to RPMs and debs

          1. In a separate package
          2. Named Apache-Hadoop. People out there are already releasing Hadoop RPMs; we don't want confusion.
          3. With all config files in the RPM marked as %config(noreplace) files, which end users can stamp on, or a separate roll-your-own-config RPM tool
          4. Once the tests are designed to run against remote systems, they should be run against the RPM installations.

          I don't volunteer to write the spec files or the build files, but all of mine are up to look at, and I will formally release them as Apache licensed if you want to use them as a starting point:
          http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/release/
          I could help with some of the functional testing now, provided it uses some of my real/virtual cluster management stuff to pick target hosts.

          steve_l added a comment -

          I should add that I do include my own Hadoop jars in my RPMs, and that these RPMs are what get installed in machine images (real or virtual) that are then used for all the cluster based testing. Because if you are going to distribute your artifacts as RPMs, that's how you should test your code. Once you've automated RPM installation and the creation of gold-VM images for your target VM infrastructure (VMWare, Xen, EC2, etc), then you can worry about cluster scale testing of the artifacts.

          steve_l added a comment -

          In HADOOP-3835 dhruba proposed an RPM target in the build as a "minor" improvement. This issue is later but it has more watchers, so I'm going to close Dhruba's issue as a duplicate, even though his came first.

          steve_l added a comment -

          HADOOP-5615 includes some spec files; these could be a starting point for something targeting 0.22+ (with all of avro's jars too).

          Steve Loughran added a comment -

          changing the title

          Steve Loughran added a comment -

          Here is how I propose doing this

          1. Have a subproject that uses Ivy to pull in what is needed
          2. Start with the spec files of HADOOP-5615 updated with any changes needed for 0.21
          3. Remove the requirement for sun-java, maybe add an RPM to make that dependency explicit as an option for people who need it and don't want openjdk/jrockit jvms.
          4. Build file creates the RPMs of all the JARs etc, with configuration a separate RPM.

          Troublespots:

          1. Configuration. The initial cut takes the conf dir and creates the RPMs for it.
          2. Native binaries. I've never done native RPMs, so I don't know where to begin.
          3. Upgrades and testing thereof. Where it gets really fun is when you think about FS upgrades alongside RPM upgrades.
          4. Automated testing. Help needed; we'd like this to work under Hudson too.

          The goal here is to have a basic Apache Hadoop RPM set, something we can start off with in 0.21 beta tests to see how they work.

          Allen Wittenauer added a comment -

          > native binaries. I've never done native RPMs, don't know where to begin.

          We'll basically need an RPM per-architecture:

          • noarch for the pure java, shell, etc bits
          • i586 for the 32-bit compiled stuff
          • x86_64 for the 64-bit compiled stuff

          I suspect we'll also need this broken up by project. i.e.:

          hadoop-X.X.X-common-X.X.X.noarch, hadoop-X.X.X-hdfs-X.X.X.noarch, hadoop-X.X.X-mapred-X.X.X.noarch,
          hadoop-X.X.X-hdfs-X.X.X.i586, hadoop-X.X.X-hdfs-X.X.X.x86_64,
          hadoop-X.X.X-mapred-X.X.X.i586, hadoop-X.X.X-mapred-X.X.X.x86_64, where X.X.X is the version number. [Need to package per-version RPMs so that upgrades are tolerable. :( ]
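
          (As an illustration of the per-architecture split described above, one way this could be driven with rpmbuild; the spec file name and the idea that a single spec covers all architectures are assumptions:)

            # Hypothetical sketch: building the same spec for each target architecture.
            rpmbuild --target noarch -bb hadoop.spec    # pure Java, shell scripts, etc.
            rpmbuild --target i586   -bb hadoop.spec    # 32-bit native code
            rpmbuild --target x86_64 -bb hadoop.spec    # 64-bit native code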

          Steve Loughran added a comment -

          OK.

          1. Presumably we'd have a hadoop-avro too, or would that start off in hadoop-common.noarch
          2. For conf, I think we could pre-generate a few example configurations (e.g. hadoop-conf-standalone), and I'm assuming one conf RPM for everything instead of a common-conf, hdfs-conf and mapred-conf.
          3. We should also redistribute the .tar file needed for someone on a Unix with rpmbuild installed to create their own conf RPMs. That's what I effectively do in Smartfrog, where the configuration also tells the runtime what services to deploy on startup. That way, anyone is free to create their own -conf RPM from a local set of files and push it out to the cluster.

          How to start this: is there a bit of SVN where we can start to prototype something?

          Owen O'Malley added a comment -

          Avro is just another dependency. Until it starts generating RPMs, I suggest that we just include all of the dependencies in the RPM for hadoop-common.noarch.

          We also need to separate the client from the server RPMs, since it would be really nice to be able to install the client code without root, but the server-side RPMs require root. (To install the setuid task controller...)
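
          (Purely to illustrate why a root-free client package is attractive: one way to unpack a client-only RPM without root privileges. The package name and version here are hypothetical, and this extracts the payload rather than performing a managed install:)

            # Unpack the RPM payload into a home directory; no rpm database or root needed.
            mkdir -p "$HOME/hadoop-client" && cd "$HOME/hadoop-client"
            rpm2cpio /tmp/hadoop-client-0.20.204.0.noarch.rpm | cpio -idmv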

          Konstantin Boudnik added a comment -

          It seems like people here are fond of having a separate project from Hadoop itself. While that seems like a good idea, I can see a slightly different approach. Here it is:

          • native packaging specs (DEBs, RPMs, whatnot) are placed within the current build system, i.e. under $project.root/packaging/specs
          • the existing build.xml is extended with an extra target create-packages, depending perhaps on tar
          • an execution of
            ant create-packages -Dpackage.type=RPM|DEB -Dpackage.arch=noarch|x32|x64|ARM
            will exec a package-creation script from $project.root/packaging/scripts using the spec specified by package.type
          • similarly, test packages can be produced by
            ant create-packages -Dpackage.type=RPM|DEB -Dpackage.class=test

          IMO this approach should allow for reuse of some of the existing build functionality without adding any extra hassle to the current build system. Besides, the packaging will be kept as a part of the project itself but will be physically separated from the build system.

          Any dependencies resolvable as packages of a certain type need to be listed as external dependencies. Non-resolvable ones (like Avro above) will have to be included.

          We also need to separate the client from the server rpms

          +1 on this one.

          Konstantin Boudnik added a comment -

          After a good discussion with Owen and Roman, here's a better proposal:

          • new top level targets to be introduced into ant build:
            • package-32
            • package-64
            • package-noarch
            • package depends on all above

          It is up to the build to find out the type of OS it is running on and either locate the appropriate packaging script and schema or fail with appropriate diagnostics.

          The preliminary structure of installed packages is like this:

          $root/
            bin/
              hadoop
              hadoop-daemon?.sh
              hdfs
              mapred
              <other user facing scripts>
            share/
              hadoop/
                bin/
                  hadoop-config.sh
                lib/
                  *.jar
                man/
                include/
                  c++/
                sbin/
                  jvsc
                  taskcontroller
                  runAs (for Herriot packages)
          

          Some notes:

          • jar files in share/hadoop/lib/ have to have their owners based on the components they are coming from (e.g. hdfs-client, hdfs-server, etc.)
          • packages required by Hadoop but aren't included into its source code (LZO is a good example) shall be delivered via inter-package dependencies.

          Something has to be done about configs. Shall they be placed under /etc/hadoop perhaps?

          Herriot package (test for real cluster as in HADOOP-6332) shall be created separately because it requires byte-code instrumentation.

          Allen Wittenauer added a comment -

          Are we trying to build a package that fits the local OS or are we trying to build a package that has hadoop completely in one dir and is fairly generic?

          If the former, trying to dictate where things like config files are installed is going to break. Every OS has fairly specific rules and expectations (Linux has LSB, Solaris has filesystems(5), NeXTStep... I mean, OS X, has something documented somewhere, I'm sure, etc...). If the latter, then I'd make the following changes:

          a) drop the share level. it doesn't seem to serve a purpose

          b) etc/ should contain configs

          c) lib should contain hadoop-config.sh and *.so in addition to *.jar

          d) I take it hdfs/mapred/etc is a 0.21 thing? Are we concerned about the number of people that have 'hdfs' aliased to 'hadoop dfs'? Or is it a replacement for that?

          e) var/tmp and var/logs should be defined

          Steve Loughran added a comment -

          I've been pushing for this to be downstream of the initial tar process as you may want to let people build their own RPMs, with their own config files. The tar file to create the RPMs is a redistributable all of its own. If I understood source RPMs, they would probably fit into the story somehow too.

          Allen Wittenauer added a comment -

          After a night of sleep, the following occurred to me:

          a) the stuff in sbin should actually be in libexec, along with hadoop-config.sh
          b) hdfs, mapred, and hadoop-daemon.sh (assuming those are all admin tools) should be in sbin
          c) why include/c++ ? shouldn't it just be include?

          Roman Shaposhnik added a comment -

          I have two points to add to the discussion:
          1. I'm wondering whether it would be useful to slice it a bit more thinly. IOW, introducing the notion of these extra
          top level targets available for packaging:
          hadoop-core
          hadoop-client
          hadoop-daemon
          hadoop-devel
          hadoop-javadoc

          2. As for configs, I'd like to point out an example that Debian has established with their packaging of .20. Basically
          they created one package per node type (http://packages.qa.debian.org/h/hadoop.html) plus one package common
          among all the daemons:
          hadoop-daemons-common
          hadoop-jobtrackerd
          hadoop-tasktrackerd
          hadoop-datanoded
          hadoop-namenoded
          hadoop-secondarynamenoded

          The packages themselves are pretty slim – containing only hooks to make daemons plug into the service management
          system (init.d in Debian's case, but one would imagine Solaris/SMF or anything like that also being an option for us).
          I also would tend to believe that these could be reasonable packages to be used for splitting the configs appropriately.

          Konstantin Boudnik added a comment -

          a) the stuff in sbin should actually be in libexec, along with hadoop-config.sh

          Perhaps for the current sbin/ stuff that's a better place. But why hadoop-config.sh? I can see the latter being in sbin/, though.

          b) hdfs, mapred, and hadoop-dameon.sh (assuming those are all admin tools) should be in sbin

          In a sense I agree - they are admin tools and therefore need to be in sbin. They also aren't likely to be invoked directly, but rather through distro-specific start/stop facilities, e.g. init.d or service. So perhaps they do indeed belong in sbin/.

          c) why include/c++ ? shouldn't it just be include?

          Agreed, I guess c++ was an example of the stuff we need to put into include/.

          Allen Wittenauer added a comment -

          I might be wrong, but I'm pretty certain that hadoop-config.sh isn't meant to be directly user-runnable. It falls into the same category as the taskController.

          Eric Yang added a comment -

          Patch for creating RPM or DEB packages.

          Eric Yang added a comment -

          This patch applies to branch-0.20-security-patches.

          Usage to build RPM packages for Redhat/CentOS:

          ant rpm -Djava5.home=... -Dforrest.home=... -Dlibhdfs=1 -Dcompile.c++=true -Dcompile.native=true

          Usage to build DEB package for Ubuntu:

          ant deb -Djava5.home=... -Dforrest.home=... -Dlibhdfs=1 -Dcompile.c++=true -Dcompile.native=true

          The generated packages are as follows:

          hadoop-[version].[arch].rpm
          hadoop-conf-pseudo-[version].[arch].rpm
          hadoop-[version].src.rpm
          hadoop_[version]_[arch].deb
          hadoop-conf-pseudo_[version]_[arch].deb
          

          The RPM package is designed to be relocatable. Users can install it with:
          rpm -i hadoop-[version].[arch].rpm --relocate /usr=/usr/local

          The relocatable directories are:

          /etc/hadoop
          /var/log/hadoop
          /var/run/hadoop
          /usr
          

          See Owen's proposal for directory layout.

          Eric Yang added a comment -

          Update proposal for file location.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12470823/deployment.pdf
          against trunk revision 1069677.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/226//console

          This message is automatically generated.

          Allen Wittenauer added a comment -

          Looking at the pdf, a few comments/questions.

          • The document should give a solid description of each directory.
          • The directory description should also be put either on the wiki or on the main hadoop site to provide guidance when we add new functionality as to where the content should go
          • Is var/lib/data-node meant to be HDFS storage?
          • Is var/lib/task-tracker meant to be MR spill space? Why not var/tmp/something to reflect that it is transient data?
          • Do the other services get custom log dirs in var/log? How does that impact permissions when Hadoop runs tasks as multiple users?
          • Why are we putting jar files in share instead of somewhere in lib? Doesn't that violate the "Libraries that aren't loaded via System.LoadLibrary" rule?
          • Where are we putting the default xml files as a reference?
          • Where do the various 'helper' scripts that are currently in bin go? (hadoop-env.sh, etc)
          Eric Yang added a comment -
          • Yes, $PREFIX/var/lib/data-node is meant for HDFS storage.
          • Yes, $PREFIX/var/lib/task-tracker is for MR spill space.
          • The locations are configurable via Hadoop configuration. The conf-pseudo RPM is designed as a single-node deployment, which puts data in /var/lib/hdfs/data-node and /var/lib/mapred/task-tracker respectively.
          • Yes, other services get custom log dirs in var/log. Userlogs are stored in a subdirectory, /var/log/hadoop/mapred/userlogs, by the single-node config RPM. This is configurable by changing the default hadoop-env.sh.
          • Stuff in lib is usually meant for C libraries, hence Owen suggested putting jar files in share.
          • The default configuration goes to /usr/share/hadoop/conf, and /etc/hadoop is a symlink to conf.
          • hadoop-env.sh is in /usr/share/hadoop/conf, and /etc/defaults/hadoop-env.sh is symlinked to conf/hadoop-env.sh.
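
          (For illustration only, the symlink layout just described might be created by a post-install step roughly like this; the paths come from the comment above, the commands themselves are not from the patch:)

            # default /usr prefix assumed
            ln -s /usr/share/hadoop/conf /etc/hadoop
            ln -s /usr/share/hadoop/conf/hadoop-env.sh /etc/defaults/hadoop-env.sh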
          Allen Wittenauer added a comment -

          > Yes, $PREFIX/var/lib/task-tracker is for MR spill space.

          I'm slightly disturbed by the connotation that var/lib carries here. The TT (and MR framework as a whole) is pretty horrific at disk space management and requires ops teams to intervene quite a bit. I feel as though if we make it var/lib, ops teams will be more hesitant to clean it out.

          > Stuff in lib are usually meant for C libraries, hence Owen suggested to put jar files in share.

          Someone needs to tell perl, python, ruby, tcl/tk, ... that they shouldn't be putting stuff there then.

          In the case of all those languages, they create a /usr/lib/(language) dir and then start filling it with appropriate stuff. There is also plenty of precedent for application specific libraries being in /usr/lib/(application). Jar files are as close to C .so's as Java gets. It makes more sense to me to put them in lib.

          > The default configuration goes to /usr/share/hadoop/conf and /etc/hadoop is symlink to conf

          Do you mean that /etc/hadoop -> /usr/share/hadoop/conf? If so, that is very wrong. /etc/hadoop shouldn't be a symlink to anything and is the place where the human editable files are at.

          Eric Yang added a comment -

          $PREFIX/var/lib is configurable by XML configuration. Users have full control of where to put this directory for multi-node deployments. For single-node deployments, I put it in /var/lib/hadoop as good practice.

          The native Hadoop C libraries go into $PREFIX/lib. For now, $PREFIX/share/hadoop is chosen for backward compatibility with HADOOP_HOME. Once we move to Maven and have a separate jar file for each module, it will be easier to modularize and put jar files into /usr/lib/hadoop or /usr/lib/java/hadoop.

          I chose to symlink /etc/hadoop to /usr/share/hadoop/conf for backward compatibility with HADOOP_HOME, for the same reason that /etc/grub.conf is symlinked to /boot/grub/grub.conf. I could make the packaging system do the reverse, if the Hadoop community prefers that.

          Allen Wittenauer added a comment -

          Have we actually had an Apache release of hadoop that shipped jars and conf in share? If not, then the backward compatibility argument isn't valid.

          Eric Yang added a comment -

          Allen, what location would you recommend for HADOOP_HOME in /usr path?

          Eric Yang added a comment -

          Updated patch for branch-0.20-security.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12470917/HADOOP-6255-branch-0.20-security.patch
          against trunk revision 1069677.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/228//console

          This message is automatically generated.

          Nigel Daley added a comment -

          +1 for documenting this layout on wiki.

          Steve Loughran added a comment -

          1. Logs are trouble: you can create a lot of them, and people need to know they shouldn't go on the root disk. As Nigel says, this should be wiki'd.

          2. You can test RPMs, though not directly at the JUnit level. To test them you need to bring up a VM with the target OS in a known state, copy over the RPMs, install them, and start Hadoop. Even without that, running rpmbuild as part of the nightly releases helps catch problems sooner rather than later.

          3. Which OS is targeted? RHEL/CentOS 5.5? I have a VirtualBox VM for that.

          4. Even if there aren't any .deb files created right now, it'd be good to make sure the layout is compatible with the Debian team's rules.

          You don't need RHEL to do the builds; any Unix with rpmbuild works. Not OS X, though: I've tried that before and found too many problems with the rpmbuild version and with the shell: http://jira.smartfrog.org/jira/browse/SFOS-1255
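
          (A minimal sketch of the copy/install/start loop described in point 2; the VM hostname, package file names and init script name are assumptions, not from any patch on this issue:)

            # push freshly built packages to a known-state test VM
            scp build/hadoop-*.rpm root@centos55-vm:/tmp/
            # force out any old version, then install the new one
            ssh root@centos55-vm 'rpm -e hadoop 2>/dev/null; rpm -i /tmp/hadoop-*.rpm'
            # walk a service entry point and check that it comes up
            ssh root@centos55-vm '/etc/init.d/hadoop-namenode start && /etc/init.d/hadoop-namenode status'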

          Allen Wittenauer added a comment -

          > Allen, what location would you recommend for HADOOP_HOME in /usr path?

          That's sort of the point of standardized locations. HADOOP_HOME becomes superfluous. In fact, with a working ${BASH_SOURCE-0} and a few other changes in the shell commands, you can run Hadoop 0.20 without setting $HADOOP_HOME/$HADOOP_CONF_DIR because hadoop-config.sh works properly.

          What should likely happen as part of this change is that hadoop-config.sh is modified to honor any traditional HADOOP_HOME and HADOOP_CONF_DIR passed options and env settings. If neither exist, go look at the places defined in the packages. Order should be something like:

          1) --config location
          2) HADOOP_CONF_DIR location
          3) HADOOP_HOME/etc by new rules or if you want to be backward compat, HADOOP_HOME/conf
          4) /etc/hadoop

          Doing the above allows for a non-traditional location by setting HADOOP_HOME/HADOOP_CONF_DIR just as one would do so today.
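
          (To make the resolution order above concrete, a hypothetical sketch; this is not the actual hadoop-config.sh from any patch, and the function name is invented:)

            # resolve_conf_dir: pick a config directory using the precedence described above
            resolve_conf_dir() {
              if [ "$1" = "--config" ] && [ -n "$2" ]; then
                echo "$2"                               # 1) explicit --config argument
              elif [ -n "${HADOOP_CONF_DIR}" ]; then
                echo "${HADOOP_CONF_DIR}"               # 2) environment override
              elif [ -n "${HADOOP_HOME}" ] && [ -d "${HADOOP_HOME}/conf" ]; then
                echo "${HADOOP_HOME}/conf"              # 3) traditional HADOOP_HOME layout
              else
                echo "/etc/hadoop"                      # 4) packaged default
              fi
            }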

          The big question is whether or not you want to honor ${BASH_SOURCE}/../etc or ${BASH_SOURCE}/../conf similar to how it works in the current releases. My thought is no, as we provide more than enough hooks to override with just the four above.

          One of the tricky issues is what to do about hadoop-config.sh itself. The "new standard" is typically to compile the logic into the commands and then provide an "app-config" shell script in /usr/bin that provides the well-known locations to external apps (which is really the only "new" bit). This is probably the approach we should take. At build time, hadoop-config.sh or equivalent gets sucked into the various shell commands, including a new hadoop-config executable, so that HBase and others get the info they need to execute properly. Another choice is to include hooks into pkgconfig, but I don't think that's as fluid as we really want here.

          Allen Wittenauer added a comment -

          > Even if there aren't any .deb files created right now, it'd be good to
          > make sure the layout is compatible with the debian team rules

          This is a great reason to make hadoop-config pluggable (at build time?) for anyone that wants to enforce OS-specific rules. For example, I know the current layout runs very much afoul of SVRx guidelines which dictates all of this should have /opt sprinkled here and there.

          Eric Yang added a comment -

          Steve,
          1. The RPM can relocate /var/log/hadoop to anywhere. During the RPM install phase, update-hadoop-env.sh will adjust the log directory to the user-specified directory. For example:

          rpm -i hadoop-[version].[arch].rpm \
            --relocate /usr=/usr/local/hadoop-0.20.100 \
            --relocate /var/log=/usr/local/hadoop-0.20.100/logs \
            --relocate /etc/hadoop=/usr/local/hadoop-0.20.100/conf
          

          2. For now, rpmbuild should be sufficient. After Hadoop switches to Maven, it will be more valuable to have RPM testing in the integration-test phase.

          3. This should work on both RHEL and CentOS 5.5; it has been tested on RHEL 5.1 and 5.5.

          4. The .deb build should be cross-platform (tested on RHEL 5.1), but it requires jdeb. Did you run "ant deb" instead of "ant rpm"?

          Allen,

          The current design allows the configuration directory to be relocated. Example:

          rpm -i hadoop-[version].[arch].rpm \
            --relocate /usr=/usr/local/hadoop \
            --relocate /etc/hadoop=/usr/local/etc/hadoop
          

          This works fine, but it will have /usr/local/hadoop/conf hosting the actual files, and /usr/local/etc/hadoop as a symlink to /usr/local/hadoop/conf. I guess you would prefer to have /usr/local/etc/hadoop host the files, and /usr/local/hadoop/conf symlink to /usr/local/etc/hadoop. Right? I will make this change.

          There are two possible places to override the path configuration: one is at build time, and the other is at installation time.

          Build Time:
          build.properties of the Ant build can override the path locations:

          • package.prefix - Location of binary file prefix
          • package.conf.dir - Location of configuration directory
          • package.log.dir - Location of log directory
          • package.pid.dir - Location of pid directory
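
As a rough sketch only, a build-time override might look like the following. The property names come from the list above and the "ant rpm"/"ant deb" targets are the ones mentioned earlier in this thread; the /opt/hadoop prefix and the -D command-line style are assumptions, not taken from the patch:

# Override the package layout via build.properties before building the rpm.
cat > build.properties <<'EOF'
package.prefix=/opt/hadoop
package.conf.dir=/etc/hadoop
package.log.dir=/var/log/hadoop
package.pid.dir=/var/run/hadoop
EOF
ant rpm

# Or pass the overrides directly on the Ant command line
# (command-line -D properties take precedence over the file):
ant -Dpackage.prefix=/opt/hadoop -Dpackage.log.dir=/var/log/hadoop rpm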

          Install Time:
rpm allows the locations to be overridden at installation time by using the --relocate directive.
Debian does not support relocation, hence it needs to be controlled at build time.

The current patch does not expose the build-time parameters; I will expose them in the next patch.

          Nigel,

          I will write a wiki page for this, any recommended location?

          Owen O'Malley added a comment -

Allen and Steve, I believe that the proposed layout follows the Red Hat & Debian guidelines, where all of the arch-dependent files go into $prefix/lib and the arch-independent files go into $prefix/share.

          See http://www.debian.org/doc/packaging-manuals/fhs/fhs-2.3.html .

          Documentation for the layout should actually be in the checked in docs and explicitly not in wiki.

          Owen O'Malley added a comment -

          Allen, you're right that HADOOP_HOME would be deprecated once you adopt this, but for the cross-over, we'll need to support a directory that looks like a HADOOP_HOME with symlinks to the real files. Where is the "standard" place to put such a thing?

          Allen Wittenauer added a comment -

          Eric> After hadoop switching to maven, it will be more valuable to have rpm testing
          Eric> in the integration test phase.

          Why would maven make a difference?

          Eric> I guess you prefer to have /usr/local/etc/hadoop host the file,
          Eric> and /usr/local/hadoop/conf symlink to /usr/local/etc/hadoop. Right?

          Correct.

          Eric> rpm allows write of the location at installation time by using --relocate directive.
          Eric> Debian does not support relocation, hence it needs to be controlled at compile time.

          Most packaging systems that at least I'm familiar with don't allow RPM's level of relocation. This is good and bad. In our case, it sure makes it seem like we need to build hadoop-config.sh at install-time, at least in RPMs.

          Eric> I will expose the build time parameters in the next patch.

          Awesome!

          Owen> Allen and Steve, I believe that the proposed layout follows the redhat &
          Owen> debian guidelines where all of the arch dependent files go in to $prefix/lib
          Owen> and the arch independent files go into $prefix/share.

          ObDisclosure/Rant: I don't think FHS is 100% the right way to do things 100% of the time. My particular beef is that I'm not a fan of relatively hefty applications that are typically running on dedicated boxes (Fedora/389 Directory Server, I'm looking at you) strictly following the FHS--and thus scattering files all over the file system. It almost always makes upgrading in place HARD, from both a user and a developer perspective. In the case of Hadoop, it always made sense to me to have a single, consolidated directory because it hits my 'dedicated box' criteria. However, I'm trying to keep an open mind on this one...

          So, anyway, on to jars. I dug into this a bit more. jar files in share is a mixed bag. On a pure technical level, jar files are architecture-independent and would therefore qualify to go to share. But by FHS rules, it looks like lib is just as valid:

          "/usr/lib includes object files, libraries, and internal binaries that are not intended to be executed directly by users or shell scripts."

          (if we keep in mind that java is reading the jar files, not the shell scripts)

          Doing a quick pass through the various OSes I have laying about, I'm finding far more jar files in lib than I am in share. (I don't have Debian installed, but several revs of RPM-based Linuxes. Both RHEL and Debian push the FHS as The Spec. Interestingly enough, only OS X had no jar files in lib and all in share. But I think we can all agree that OS X falls into 'weirdo' category in most cases...)

          FHS, BTW, also has this to say:

          "It is recommended that application-specific, architecture-independent directories be placed here. Such directories include groff, perl, ghostscript, texmf, and kbd (Linux) or syscons (BSD). They may, however, be placed in /usr/lib for backwards compatibility, at the distributor's discretion."

          From here, two things:

          1) So even though they give perl as an example, I know I have yet to work on an OS that was built that way. This is likely due to backward compatibility.

2) I think putting jars in share is not in line with the spirit of the text or past history. /usr/share is meant to be content that could be NFS mounted from a common source (i.e., shared) and not essential to a working system. Reading through the FHS gives plenty of examples that meet that intent: documentation, timezone files, dictionaries, and other misc files.

          Owen> but for the cross-over, we'll need to support a directory that looks like a
          Owen> HADOOP_HOME with symlinks to the real files.

          I'm going to play devil's advocate here and ask... Do we? There are times when breaking backward compatibility is a good thing. I'd argue this is a great time to do it. I think we're young enough to get away with it and given that this release will be majorly transitional anyway.... But if you guys are set on this, then what I've typically seen is /usr/(appname-version) dir and populate it, essentially transitioning from a pseudo-SysV layout to FHS.

          Eric Yang added a comment -

          Allen> Why would maven make a difference?

          The value in testing RPM installation would be to verify the RPM dependencies are correct.
The current ant setup has everything bundled in the hadoop-core-*.jar file, and the single rpm contains both server and client. It is not possible to componentize the RPMs and test RPM dependencies. Therefore, the installation test is more meaningful after Hadoop has been modularized.

I plan to revise the packages after the Maven project is done to ensure the deployment setup is in line with the document.

          Eric Yang added a comment -
• Revised patch to use /etc/hadoop for hosting the configuration files; $PREFIX/share/hadoop/conf is a symlink to /etc/hadoop (a quick verification sketch follows below).
          • Added deployment layout document to Hadoop [version] Documentation -> Common -> Deployment Layout
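
For context, a quick way to check that layout after installing the package might be the following; this is only a sketch, and the /usr prefix is an assumption rather than something stated in the bullet above:

readlink /usr/share/hadoop/conf      # expected: /etc/hadoop
ls /etc/hadoop                       # the actual configuration files live here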
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12471033/HADOOP-6255-branch-0.20-security-1.patch
          against trunk revision 1070021.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/236//console

          This message is automatically generated.

          Eric Yang added a comment -

Correction to LZO library location

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12471145/HADOOP-6255-branch-0.20-security-2.patch
          against trunk revision 1071084.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/239//console

          This message is automatically generated.

          Steve Loughran added a comment -

          I'd propose rather than making this a tweak of the hadoop-common stuff, could we have a downstream project that takes the artifacts (including, later, higher level things) and packages them up.

          1. Putting it into hadoop-common creates a loop in the cycle.
          2. A project purely to do the rpm/deb files would have its own functional tests that you'd point at target machines to install and test, so the packaging comes before testing.
          3. You'd want the packaging and testing of this stuff to be an independent thing to test on hudson from the rest of the code
          4. It may come out on its own release schedule, as you can do point releases of the RPMs that change some of the RPM files without changing the binaries
          5. People not using Unix aren't going to be able to run the <rpmbuild> targets.
          6. We want to give people the right to check out/clone this project, supply new configuration files, and build new RPM/deb files, without rebuilding the rest of the code.

          We've just started discussing on general a hadoop-test project in incubation, what about a hadoop-release project that depends on hadoop-core, hdfs, mapred &c and creates the rpm artifacts

          Owen O'Malley added a comment -

          Steve,
          The rpm and debian targets are of course optional and the tar target will remain. Apache releases are based on the source and not the binary artifacts. That said, keeping the build infrastructure in (and branched with) the project that it is building seems much simpler than having a separate project that needs to be kept in sync.

          Mahadev konar added a comment -

I agree with Owen. I think it's best to keep the build infrastructure within the project itself. Given that most folks run Hadoop on Unix platforms, it would be great to have rpm builds. Also, keeping it outside creates unnecessary work for someone to keep in sync with Hadoop builds.

          Eric Yang added a comment -

There are pros and cons to having packaging outside of the source code. When packaging is outside of the source code, it encourages downstream companies like Red Hat to aggregate patches and backport features to create working packages. However, it takes much longer to produce a useful release because the usable binary depends on the support of downstream companies rather than Hadoop developers. The adoption rate will improve when features can be delivered to customers sooner rather than later. For this reason, the development community should be responsible for governing its own creation, rather than leaving packaging problems to downstream adopters.

There is a wide spectrum of testing: unit tests, integration tests, performance tests, stress tests, regression tests, etc. Unit tests still run prior to packaging, but integration tests should run after packaging. The current setup in Hudson runs only unit tests. Integration tests will come when Hadoop converts to Maven and a cluster of machines is set up in Hudson. When those prerequisites are completed, it makes sense to test package installation and run additional tests.

The goal of this patch is to ensure that dependent projects like Pig, Hive, HBase, Chukwa, and Mahout have good alignment on the project structure to improve the development/deployment environment. I plan to submit patches for downstream projects to adopt the same structure, and hopefully smooth out integration pain points among projects.

          Eric Yang added a comment -

          Revise patch to allow deb package to control path structure during build time.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12471352/HADOOP-6255-branch-0.20-security-3.patch
          against trunk revision 1071364.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/241//console

          This message is automatically generated.

          Owen O'Malley added a comment -

          new versions with mentions of yahoo proprietary tools removed

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12471424/deployment.tex
          against trunk revision 1071364.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/242//console

          This message is automatically generated.

          Konstantin Boudnik added a comment -

          I'd propose rather than making this a tweak of the hadoop-common stuff, could we have a downstream project that takes the artifacts (including, later, higher level things) and packages them up.

          +1 this makes a lot of sense.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12471424/deployment.tex
          against trunk revision 1071364.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/287//console

          This message is automatically generated.

          Eric Yang added a comment -

HDFS Client throws an exception for SecurityAudit.audit if the log directory is configured with ${HADOOP_IDENT_STRING} as part of the path.
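
For illustration, a minimal hadoop-env.sh fragment of the kind of configuration being described might look like the following; the exact values are an assumption, not taken from the patch:

# Hypothetical hadoop-env.sh fragment illustrating the reported case:
# the log directory embeds HADOOP_IDENT_STRING, which the SecurityAudit.audit
# log path fails to resolve on the HDFS client side.
export HADOOP_IDENT_STRING=$USER
export HADOOP_LOG_DIR=/var/log/hadoop/$HADOOP_IDENT_STRING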

          Eric Yang added a comment -

Revise init.d script for RPM to check status.

          Allen Wittenauer added a comment -

          Going through the doc again (I'll get to the patch later):

          • I'm still a little concerned that the deployment document talks about things that are outside the scope of this JIRA. Even worse, it mentions things that Apache cannot distribute (the LZO code). This should really get sanitized for Hadoop.
          • is include/hdfs.h really the proper location or should it be include/hadoop/hdfs.h?
          Eric Yang added a comment -

include/hdfs.h is the proper location if hdfs.h is a public API that dependent projects (e.g., fuse-dfs) can interface with. I don't know the right answer, but include/hdfs.h works for now.

          Eric Yang added a comment -

          Package builder design

To support multiple types of packages, this project lays out the packaging source code structure like this:

          src/packages/rpm
                      /deb
                      /conf-pseudo
          

rpm - metadata for creating RPM packages. A SysV init style startup script is also included for the startup process in Red Hat-like environments.

deb - metadata for creating Debian packages. A BSD init style startup script is also included for the startup process in Ubuntu-like environments.

conf-pseudo - configuration template for a demo pseudo-distributed cluster setup. By default, neither the rpm nor the deb binary package starts any services. The purpose of conf-pseudo is to provide an (rpm/deb) package that demonstrates how to set up a single-node cluster and turn on services through configuration.

The software home directory is designed to be located in ${prefix}/share/${project}.

src/packages/update-${project}-env.sh runs in the post-installation phase; it creates symlinks and maps the software structure to the layout proposed in HADOOP-6255.

/etc/default/${project}-env.sh is symlinked to the project environment script. Hence, project environment variables are shared across projects.
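
As a rough illustration of the post-install step described above (this is a sketch, not the actual update script from the patch; it assumes prefix=/usr and project=hadoop):

# Hypothetical sketch of the kind of work update-hadoop-env.sh performs post-install.
ln -s /etc/hadoop /usr/share/hadoop/conf                     # conf symlink into the share tree
ln -s /etc/hadoop/hadoop-env.sh /etc/default/hadoop-env.sh   # shared environment script
mkdir -p /var/log/hadoop /var/run/hadoop                     # default log and pid directories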

The project build file can override the package paths in the build phase.

Sample build.properties:

package.prefix=/usr
package.conf.dir=/etc/${project}
package.log.dir=/var/log/${project}
package.pid.dir=/var/run/${project}

For the RPM package, it is possible to override the locations at installation time by specifying:

rpm -i ${project}-[version]-[rev].[arch].rpm \
  --relocate /usr=/usr/local/hadoop \
  --relocate /etc/hadoop=/usr/local/etc/hadoop \
  --relocate /var/log/hadoop=/opt/logs/hadoop \
  --relocate /var/run/hadoop=/opt/run/hadoop

The same build structure can be applied to both Ant and Maven build scripts. It is also extensible to include a Mac native package installer using this design pattern.

          Eric Yang added a comment -
          • Move start/stop scripts from $PREFIX/bin to $PREFIX/sbin.
          • Added warning message that $HADOOP_HOME is deprecated per Owen's request.
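
A minimal sketch of the kind of warning being described, assuming it lives in the launcher scripts (illustrative only, not the exact code from the patch):

# Hypothetical sketch: warn when a user still sets the deprecated variable.
if [ "${HADOOP_HOME:-}" != "" ]; then
  echo "Warning: \$HADOOP_HOME is deprecated." 1>&2
fi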
          Eric Yang added a comment -

Refactor code to add the setup-single-node-hadoop.sh script. If the user wants more control over the single-node setup procedure, the interactive script can assist with the setup instead of the hadoop-conf-pseudo setup rpm.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12476022/HADOOP-6255-branch-0.20-security-6.patch
          against trunk revision 1094750.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/367//console

          This message is automatically generated.

          Eric Yang added a comment -
          • Revised patch to remove conf-pseudo packaging.
          • Renamed setup-single-node-hadoop.sh to hadoop-setup-single-node.sh
          • Symlink hadoop-config.sh to $PREFIX/libexec
          • Improved detection of user defined HADOOP_HOME
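
For illustration, detecting a user-defined HADOOP_HOME in a wrapper script might look roughly like the following; this is a sketch under assumed conventions, not the patch's actual hadoop-config.sh:

# Hypothetical sketch: prefer a user-defined HADOOP_HOME, otherwise derive it
# from the location of this script (e.g. $PREFIX/libexec/hadoop-config.sh).
this="${BASH_SOURCE-$0}"
bin=$(cd -P -- "$(dirname -- "$this")" && pwd -P)
if [ -z "$HADOOP_HOME" ]; then
  export HADOOP_HOME=$(dirname "$bin")   # fall back to the package prefix
fi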
          Amr Awadallah added a comment -

          I am out of office and will be slower than usual in responding to
          emails. If this is urgent then please call my cell phone (or send an
          SMS), otherwise I will reply to your email when I get back.

          Thanks for your patience,

          – amr

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12477985/HADOOP-6255-branch-0.20-security-7.patch
          against trunk revision 1097322.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/392//console

          This message is automatically generated.

          Konstantin Boudnik added a comment -

          I'm still a little concerned that the deployment document talks about things that are outside the scope of this JIRA. Even worse, it mentions things that Apache cannot distribute (the LZO code). This should really get sanitized for Hadoop.

A valid point indeed. I think having packaging outside of the project itself will address such concerns with ease, because packaging can be licensed differently than Hadoop proper.

          Eric Yang added a comment -

A valid point indeed. I think having packaging outside of the project itself will address such concerns with ease, because packaging can be licensed differently than Hadoop proper.

There can be an LZO rpm as an add-on; this integration project is extensible. Hadoop works properly without LZO. Hence, the integration JIRA for Hadoop should focus on Apache-only components. Commercial vendors are aggregating Hadoop-related technologies to make proper distributions for commercial use. People who want an effortless Hadoop setup can choose to pay commercial vendors for their effort and certification.

          Eric Yang added a comment -

          Rebase the patch for 0.20 security branch

          Eric Yang added a comment -

          Rpm/deb packaging for Hadoop common trunk

          Eric Yang added a comment -

          Rpm/deb packaging for Hadoop trunk

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12478334/HADOOP-6255-hdfs-trunk.patch
          against trunk revision 1099633.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 14 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/408//console

          This message is automatically generated.

          Eric Yang added a comment -

          Rpm/deb packaging for Hadoop Mapred trunk

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12478335/HADOOP-6255-mapred-trunk.patch
          against trunk revision 1099633.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/409//console

          This message is automatically generated.

          Eric Yang added a comment -

Exclude hadoop-env.sh.example from source code; pass the os.arch environment to the rpmbuild command.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12478351/HADOOP-6255-mapred-trunk-1.patch
          against trunk revision 1099633.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/412//console

          This message is automatically generated.

          Eric Yang added a comment -

          Fix merge conflicts with most recent svn code.
          Polish usage display message for hadoop-setup-hdfs.sh.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12478624/HADOOP-6255-common-trunk-2.patch
          against trunk revision 1100400.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 27 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          -1 release audit. The applied patch generated 3 release audit warnings (more than the trunk's current 1 warnings).

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/418//testReport/
          Release audit warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/418//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/418//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/418//console

          This message is automatically generated.

          Eric Yang added a comment -

          Update patch to reflect the current state of trunk

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12478710/HADOOP-6255-mapred-trunk-2.patch
          against trunk revision 1101199.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/421//console

          This message is automatically generated.

          Eric Yang added a comment -

Rearrange files to the proposed directory structure rather than using symlinks for the 0.20 security branch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12478846/HADOOP-6255-branch-0.20-security-9.patch
          against trunk revision 1101735.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/431//console

          This message is automatically generated.

          Eli Collins added a comment -

          Patch should be against trunk right?

          Eric Yang added a comment -

For trunk, the third-party jar files are duplicated 3 times across common, hdfs, and mapred. We should clean up the dependency and build structure to reduce duplication.

          Eric Yang added a comment -

          > Patch should be against trunk right?

Yes, there are 2 sets of patches, one for trunk and one for the 0.20 security branch.

          Eric Yang added a comment -

          Store config templates in $PREFIX/share/hadoop/templates, and change related script to use the new location.

          Owen O'Malley added a comment -

          For trunk, the default should be $PREFIX/var/log and $PREFIX/var/run.

          Owen O'Malley added a comment -

          It also looks like you picked up a generated file (configure).

          Eric Yang added a comment -

          Set the default log and pid directories to $PREFIX/var/log and $PREFIX/var/run, respectively, in hadoop-env.sh.

          Removed the incorrect change to src/test/c++/runAs/configure.
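
          For illustration only, a minimal sketch of what such defaults could look like in hadoop-env.sh, assuming HADOOP_PREFIX is already set by the surrounding scripts (the actual patch may differ):

              # Sketch: fall back to $PREFIX/var/log and $PREFIX/var/run unless the
              # admin has already exported these variables.
              export HADOOP_LOG_DIR=${HADOOP_LOG_DIR:-${HADOOP_PREFIX}/var/log}
              export HADOOP_PID_DIR=${HADOOP_PID_DIR:-${HADOOP_PREFIX}/var/run}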

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12479854/HADOOP-6255-common-trunk-5.patch
          against trunk revision 1125051.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 27 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          -1 release audit. The applied patch generated 3 release audit warnings (more than the trunk's current 1 warnings).

          +1 core tests. The patch passed core unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/485//testReport/
          Release audit warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/485//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/485//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/485//console

          This message is automatically generated.

          Eric Yang added a comment -
          • Bug fix for the Debian package name.
          • Use absolute paths for the setup scripts in case $HADOOP_PREFIX/bin/hadoop is not in the user's PATH (see the sketch below).
          • Fix a typo in patch 5.
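
          A minimal sketch of the absolute-path idea (the command shown is illustrative, not the literal patch contents):

              # Sketch: invoke the hadoop launcher by absolute path so the setup
              # scripts keep working when it is not on the invoking user's PATH.
              HADOOP_PREFIX=${HADOOP_PREFIX:-/usr}
              "${HADOOP_PREFIX}/bin/hadoop" namenode -format
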
          Eric Yang added a comment -

          Fixed the 0.20 security branch to use absolute paths for the setup scripts.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12479934/HADOOP-6255-branch-0.20-security-10.patch
          against trunk revision 1125221.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/492//console

          This message is automatically generated.

          Eric Yang added a comment -

          Changed the configuration directory from $PREFIX/conf to $PREFIX/etc/hadoop per Owen's recommendation. For RPM/deb, /etc/hadoop is used as the default, and a symlink is created so that $PREFIX/etc/hadoop points to /etc/hadoop.
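
          A minimal sketch of the packaging idea, assuming $PREFIX expands to the package install prefix (the real RPM %post / deb postinst may differ):

              # /etc/hadoop holds the actual configuration; the prefix location
              # points back to it so both paths resolve to the same files.
              mkdir -p /etc/hadoop "${PREFIX}/etc"
              [ -e "${PREFIX}/etc/hadoop" ] || ln -s /etc/hadoop "${PREFIX}/etc/hadoop"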

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480177/HADOOP-6255-common-trunk-7.patch
          against trunk revision 1126719.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 27 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          -1 release audit. The applied patch generated 3 release audit warnings (more than the trunk's current 1 warnings).

          +1 core tests. The patch passed core unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/509//testReport/
          Release audit warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/509//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/509//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/509//console

          This message is automatically generated.

          Eric Yang added a comment -

          The same $PREFIX/etc/hadoop configuration change as on trunk, but for the 0.20 security branch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480186/HADOOP-6255-branch-0.20-security-11.patch
          against trunk revision 1126719.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/511//console

          This message is automatically generated.

          Owen O'Malley added a comment -

          The hadoop-setup-single-node script is creating the config files in the cwd instead of the config directory.

          Eric Yang added a comment -

          hadoop-setup-single-node.sh uses hadoop-setup-conf.sh to generate the configuration. hadoop-setup-conf.sh is a utility for admins to generate config files in the current working directory, which they can then push out to remote nodes with their own tools such as rsync or scp. The single-node setup, however, is meant to configure everything automatically with minimal input. It is nicer to generate the config files directly into HADOOP_CONF_DIR, without storing a copy in the cwd, when hadoop-setup-single-node.sh invokes hadoop-setup-conf.sh.

          The patch is revised to avoid writing to the cwd if automatic setup is chosen.
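
          Roughly, the automatic path does something like the sketch below; the template location follows the $PREFIX/share/hadoop/templates layout mentioned earlier, and the file names and variables are illustrative rather than the script's actual interface:

              # Sketch: write the generated configs straight into the live config
              # directory instead of leaving a copy in $(pwd).
              HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/etc/hadoop}
              TEMPLATE_DIR=${HADOOP_PREFIX}/share/hadoop/templates/conf
              for f in core-site.xml hdfs-site.xml mapred-site.xml hadoop-env.sh; do
                cp "${TEMPLATE_DIR}/${f}" "${HADOOP_CONF_DIR}/${f}"
              done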

          Eric Yang added a comment -

          Enhancement to hadoop-setup-single-node.sh:

          • Delete the datanode directory if namenode formatting is chosen, to avoid leaving behind a datanode directory from an incompatible version (see the sketch below).
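
          A hedged sketch of the idea; the variable names are illustrative, not the script's actual ones:

              # Sketch: when the user chooses to format the namenode, remove the old
              # datanode storage as well so a stale, incompatible directory is not
              # left behind.
              if [ "${FORMAT_NAMENODE}" = "yes" ]; then
                rm -rf "${DATANODE_DIR}"    # e.g. ${HADOOP_PREFIX}/var/lib/hdfs/data (illustrative)
                "${HADOOP_PREFIX}/bin/hadoop" namenode -format
              fi
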
          Eric Yang added a comment -

          Reduce the safemode threshold-pct to 1.0f and safemode.extension to 3 in the default template, to improve the initial setup experience when using hadoop-setup-single-node.sh.
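
          In template terms, that corresponds to hdfs-site.xml property blocks along the following lines (the property names are the 0.20-era HDFS keys; the exact default template contents may differ):

              <property>
                <name>dfs.safemode.threshold.pct</name>
                <value>1.0f</value>
              </property>
              <property>
                <name>dfs.safemode.extension</name>
                <value>3</value>
              </property>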

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480476/HADOOP-6255-common-trunk-10.patch
          against trunk revision 1127811.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 27 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/527//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/527//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/527//console

          This message is automatically generated.

          Eric Yang added a comment -

          Replace the sudo commands in the setup scripts with su, so that external management software can invoke the scripts without a tty.
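
          The flavor of the change, as a hedged sketch (the exact commands and service user names in the scripts may differ):

              # Before: sudo can refuse to run without a controlling tty, e.g. with
              # "Defaults requiretty" in sudoers.
              #   sudo -u hdfs "${HADOOP_PREFIX}/bin/hadoop" namenode -format
              # After: su works when the script is driven by management software
              # with no tty attached.
              su -s /bin/bash hdfs -c "${HADOOP_PREFIX}/bin/hadoop namenode -format"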

          Eric Yang added a comment -

          Replace the sudo commands in the setup scripts with su so that external management software can invoke the scripts without a tty; the same change for the 0.20 security branch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480556/HADOOP-6255-branch-0.20-security-12.patch
          against trunk revision 1127811.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/529//console

          This message is automatically generated.

          Eric Yang added a comment -

          Minor bug fix so that hadoop-setup-single-node.sh works correctly on Debian.

          Eric Yang added a comment -

          Minor bug fixes:

          • Removed hadoop-config.sh from /usr/sbin.
          • Export the environment to su -c in hadoop-setup-single-node.sh and hadoop-create-user.sh (see the sketch below).
          • Removed *.debian and *.redhat from /usr/sbin.
          • Renamed hadoop to hadoop-common, and hadoop-mapred to hadoop-mapreduce.
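
          A minimal sketch of the environment-passing point above; the variables and the created path are illustrative:

              # su -c starts a fresh shell, so pass the variables the hadoop scripts
              # need explicitly as part of the command.
              su -s /bin/bash hdfs -c "HADOOP_CONF_DIR=${HADOOP_CONF_DIR} ${HADOOP_PREFIX}/bin/hadoop fs -mkdir /user/${SETUP_USER}"
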
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480593/HADOOP-6255-common-trunk-12.patch
          against trunk revision 1128003.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 27 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/532//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/532//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/532//console

          This message is automatically generated.

          Owen O'Malley added a comment -

          I just committed this. Thanks, Eric!

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #625 (See https://builds.apache.org/hudson/job/Hadoop-Common-trunk-Commit/625/)
          HADOOP-6255. Create RPM and Debian packages for common. Changes deployment
          layout to be consistent across the binary tgz, rpm, and deb. Adds setup
          scripts for easy one node cluster configuration and user creation.
          (Eric Yang via omalley)

          omalley : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1128385
          Files :

          • /hadoop/common/trunk/src/packages/deb/hadoop.control/postinst
          • /hadoop/common/trunk/src/docs/cn/src/documentation/content/xdocs/mapred_tutorial.xml
          • /hadoop/common/trunk/bin/stop-all.sh
          • /hadoop/common/trunk/CHANGES.txt
          • /hadoop/common/trunk/src/packages/rpm/init.d/hadoop-tasktracker
          • /hadoop/common/trunk/src/packages/hadoop-setup-single-node.sh
          • /hadoop/common/trunk/src/test/system/c++/runAs/runAs.c
          • /hadoop/common/trunk/bin/hadoop-daemon.sh
          • /hadoop/common/trunk/src/packages/rpm/spec/hadoop.spec
          • /hadoop/common/trunk/src/packages/deb/hadoop.control/control
          • /hadoop/common/trunk/src/packages/deb/hadoop.control/prerm
          • /hadoop/common/trunk/bin/hadoop-config.sh
          • /hadoop/common/trunk/src/packages/templates
          • /hadoop/common/trunk/src/native/Makefile.am
          • /hadoop/common/trunk/src/docs/src/documentation/content/xdocs/cluster_setup.xml
          • /hadoop/common/trunk/src/packages/deb/init.d/hadoop-namenode
          • /hadoop/common/trunk/src/packages/templates/conf
          • /hadoop/common/trunk/src/packages/deb
          • /hadoop/common/trunk/src/packages/deb/hadoop.control/preinst
          • /hadoop/common/trunk/ivy/libraries.properties
          • /hadoop/common/trunk/src/docs/cn/src/documentation/content/xdocs/cluster_setup.xml
          • /hadoop/common/trunk/bin/hadoop
          • /hadoop/common/trunk/src/packages/deb/init.d/hadoop-tasktracker
          • /hadoop/common/trunk/src/test/system/c++/runAs/configure
          • /hadoop/common/trunk/src/native/src/org/apache/hadoop/io/compress/zlib/Makefile.am
          • /hadoop/common/trunk/bin/slaves.sh
          • /hadoop/common/trunk/src/packages/deb/init.d/hadoop-jobtracker
          • /hadoop/common/trunk/src/packages
          • /hadoop/common/trunk/src/packages/templates/conf/core-site.xml
          • /hadoop/common/trunk/src/packages/rpm/init.d/hadoop-namenode
          • /hadoop/common/trunk/src/docs/cn/src/documentation/content/xdocs/hod_admin_guide.xml
          • /hadoop/common/trunk/bin/rcc
          • /hadoop/common/trunk/src/packages/deb/init.d
          • /hadoop/common/trunk/src/docs/src/documentation/content/xdocs/site.xml
          • /hadoop/common/trunk/src/docs/src/documentation/content/xdocs/single_node_setup.xml
          • /hadoop/common/trunk/src/native/packageNativeHadoop.sh
          • /hadoop/common/trunk/src/packages/deb/init.d/hadoop-datanode
          • /hadoop/common/trunk/src/test/system/c++/runAs/runAs.h.in
          • /hadoop/common/trunk/src/packages/update-hadoop-env.sh
          • /hadoop/common/trunk/src/packages/rpm/init.d/hadoop-jobtracker
          • /hadoop/common/trunk/src/packages/rpm/spec
          • /hadoop/common/trunk/src/docs/src/documentation/content/xdocs/commands_manual.xml
          • /hadoop/common/trunk/src/docs/cn/src/documentation/content/xdocs/quickstart.xml
          • /hadoop/common/trunk/bin/hadoop-daemons.sh
          • /hadoop/common/trunk/conf/hadoop-env.sh.template
          • /hadoop/common/trunk/ivy.xml
          • /hadoop/common/trunk/build.xml
          • /hadoop/common/trunk/src/packages/hadoop-setup-hdfs.sh
          • /hadoop/common/trunk/src/packages/deb/hadoop.control
          • /hadoop/common/trunk/src/docs/cn/src/documentation/content/xdocs/commands_manual.xml
          • /hadoop/common/trunk/src/packages/deb/hadoop.control/postrm
          • /hadoop/common/trunk/src/docs/src/documentation/content/xdocs/deployment_layout.xml
          • /hadoop/common/trunk/src/test/system/java/org/apache/hadoop/test/system/process/HadoopDaemonRemoteCluster.java
          • /hadoop/common/trunk/src/packages/rpm/init.d
          • /hadoop/common/trunk/bin/start-all.sh
          • /hadoop/common/trunk/src/test/system/c++/runAs/configure.ac
          • /hadoop/common/trunk/src/packages/hadoop-setup-conf.sh
          • /hadoop/common/trunk/src/packages/deb/hadoop.control/conffile
          • /hadoop/common/trunk/src/docs/cn/src/documentation/content/xdocs/streaming.xml
          • /hadoop/common/trunk/src/packages/hadoop-create-user.sh
          • /hadoop/common/trunk/src/packages/rpm
          • /hadoop/common/trunk/src/native/lib/Makefile.am
          • /hadoop/common/trunk/src/packages/rpm/init.d/hadoop-datanode
          Tsz Wo Nicholas Sze added a comment -

          Great job! We finally have RPMs in Apache Hadoop.

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk #702 (See https://builds.apache.org/hudson/job/Hadoop-Common-trunk/702/)
          HADOOP-6255. Create RPM and Debian packages for common. Changes deployment
          layout to be consistent across the binary tgz, rpm, and deb. Adds setup
          scripts for easy one node cluster configuration and user creation.
          (Eric Yang via omalley)

          omalley : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1128385
          Files :

          • /hadoop/common/trunk/src/packages/deb/hadoop.control/postinst
          • /hadoop/common/trunk/src/docs/cn/src/documentation/content/xdocs/mapred_tutorial.xml
          • /hadoop/common/trunk/bin/stop-all.sh
          • /hadoop/common/trunk/CHANGES.txt
          • /hadoop/common/trunk/src/packages/rpm/init.d/hadoop-tasktracker
          • /hadoop/common/trunk/src/packages/hadoop-setup-single-node.sh
          • /hadoop/common/trunk/src/test/system/c++/runAs/runAs.c
          • /hadoop/common/trunk/bin/hadoop-daemon.sh
          • /hadoop/common/trunk/src/packages/rpm/spec/hadoop.spec
          • /hadoop/common/trunk/src/packages/deb/hadoop.control/control
          • /hadoop/common/trunk/src/packages/deb/hadoop.control/prerm
          • /hadoop/common/trunk/bin/hadoop-config.sh
          • /hadoop/common/trunk/src/packages/templates
          • /hadoop/common/trunk/src/native/Makefile.am
          • /hadoop/common/trunk/src/docs/src/documentation/content/xdocs/cluster_setup.xml
          • /hadoop/common/trunk/src/packages/deb/init.d/hadoop-namenode
          • /hadoop/common/trunk/src/packages/templates/conf
          • /hadoop/common/trunk/src/packages/deb
          • /hadoop/common/trunk/src/packages/deb/hadoop.control/preinst
          • /hadoop/common/trunk/ivy/libraries.properties
          • /hadoop/common/trunk/src/docs/cn/src/documentation/content/xdocs/cluster_setup.xml
          • /hadoop/common/trunk/bin/hadoop
          • /hadoop/common/trunk/src/packages/deb/init.d/hadoop-tasktracker
          • /hadoop/common/trunk/src/test/system/c++/runAs/configure
          • /hadoop/common/trunk/src/native/src/org/apache/hadoop/io/compress/zlib/Makefile.am
          • /hadoop/common/trunk/bin/slaves.sh
          • /hadoop/common/trunk/src/packages/deb/init.d/hadoop-jobtracker
          • /hadoop/common/trunk/src/packages
          • /hadoop/common/trunk/src/packages/templates/conf/core-site.xml
          • /hadoop/common/trunk/src/packages/rpm/init.d/hadoop-namenode
          • /hadoop/common/trunk/src/docs/cn/src/documentation/content/xdocs/hod_admin_guide.xml
          • /hadoop/common/trunk/bin/rcc
          • /hadoop/common/trunk/src/packages/deb/init.d
          • /hadoop/common/trunk/src/docs/src/documentation/content/xdocs/site.xml
          • /hadoop/common/trunk/src/docs/src/documentation/content/xdocs/single_node_setup.xml
          • /hadoop/common/trunk/src/native/packageNativeHadoop.sh
          • /hadoop/common/trunk/src/packages/deb/init.d/hadoop-datanode
          • /hadoop/common/trunk/src/test/system/c++/runAs/runAs.h.in
          • /hadoop/common/trunk/src/packages/update-hadoop-env.sh
          • /hadoop/common/trunk/src/packages/rpm/init.d/hadoop-jobtracker
          • /hadoop/common/trunk/src/packages/rpm/spec
          • /hadoop/common/trunk/src/docs/src/documentation/content/xdocs/commands_manual.xml
          • /hadoop/common/trunk/src/docs/cn/src/documentation/content/xdocs/quickstart.xml
          • /hadoop/common/trunk/bin/hadoop-daemons.sh
          • /hadoop/common/trunk/conf/hadoop-env.sh.template
          • /hadoop/common/trunk/ivy.xml
          • /hadoop/common/trunk/build.xml
          • /hadoop/common/trunk/src/packages/hadoop-setup-hdfs.sh
          • /hadoop/common/trunk/src/packages/deb/hadoop.control
          • /hadoop/common/trunk/src/docs/cn/src/documentation/content/xdocs/commands_manual.xml
          • /hadoop/common/trunk/src/packages/deb/hadoop.control/postrm
          • /hadoop/common/trunk/src/docs/src/documentation/content/xdocs/deployment_layout.xml
          • /hadoop/common/trunk/src/test/system/java/org/apache/hadoop/test/system/process/HadoopDaemonRemoteCluster.java
          • /hadoop/common/trunk/src/packages/rpm/init.d
          • /hadoop/common/trunk/bin/start-all.sh
          • /hadoop/common/trunk/src/test/system/c++/runAs/configure.ac
          • /hadoop/common/trunk/src/packages/hadoop-setup-conf.sh
          • /hadoop/common/trunk/src/packages/deb/hadoop.control/conffile
          • /hadoop/common/trunk/src/docs/cn/src/documentation/content/xdocs/streaming.xml
          • /hadoop/common/trunk/src/packages/hadoop-create-user.sh
          • /hadoop/common/trunk/src/packages/rpm
          • /hadoop/common/trunk/src/native/lib/Makefile.am
          • /hadoop/common/trunk/src/packages/rpm/init.d/hadoop-datanode
          Owen O'Malley added a comment -

          Hadoop 0.20.204.0 was released today.


            People

            • Assignee:
              Eric Yang
            • Reporter:
              Owen O'Malley
            • Votes:
              1
            • Watchers:
              31
