BIGTOP-713: use newer debhelper and source format 3.0 (quilt) for Debian and Ubuntu packaging

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.6.0
    • Component/s: debian
    • Labels: None

      Description

      debhelper can automate a lot of common things in debian package creation.

      The current packages use an old style of debhelper that is often unnecessarily complicated, making it harder to fix things.

      For example, current Hadoop (0.23.3) does not compile on Debian because of the new GCC version. The fix is a simple "#include <unistd.h>" in the HadoopPipes.cc file.

      Modern Debian packaging with "quilt" has an excellent mechanism for managing such patches. However, in order to use this with the current Bigtop packaging, one has to:

      1. create debian/source/format containing "3.0 (quilt)",
      2. manually add quilt patching to the debian/rules targets, and
      3. make sure the .debian.tar.gz is also copied, instead of the old .diff.gz.
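      The steps above can be sketched as follows; the patch file name is illustrative, and a scratch directory stands in for the package's source tree:

```shell
# Sketch of the conversion steps, run inside an unpacked source tree.
# (Here we use a scratch directory; in practice you would be in the
# package's top-level directory, next to debian/.)
cd "$(mktemp -d)"

# 1. Declare the new source format:
mkdir -p debian/source
echo "3.0 (quilt)" > debian/source/format

# 2. Manage patches with quilt: drop them into debian/patches/ and
#    list them in the series file (the patch name is illustrative).
mkdir -p debian/patches
echo "fix-unistd-include.patch" >> debian/patches/series

# 3. After building, dpkg-source now emits a .debian.tar.gz instead of
#    the old .diff.gz, so copy/publish scripts must pick that up.
```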

      You will be surprised how many things debhelper does well on its own, with a rules file consisting of little more than the automagic:

      %:
              dh $@

      Furthermore, "java-wrappers" is a Debian and Ubuntu package that helps with setting up classpaths and choosing the JVM. It can do everything bigtop-utils does and more, and it is used by other Java packages. IMHO it should be preferred instead.

      If the packaging were more Debian-standard, it would be a lot easier to get the packages accepted into Debian mainline at some point. It may even be desirable to build the various hadoop components (-common, -yarn, etc.) independently if they are isolated well enough upstream.

      Don't get me wrong: I think the packages are pretty good already. In particular, I like the split into namenode and datanode packages and the use of update-alternatives, for example. I just found it rather hard to get a grip on the process and to get my fixes into the package. For example, I had to manually set JAVA_HOME before building, some build dependencies were missing (cmake, but that is probably a new requirement), and some paths have changed (probably due to the promotion of YARN to a top-level project?).
      I understand that you want to have as much common code for all distributions as possible, as opposed to per-distribution packaging. However, if every project uses its own specific equivalent of java-wrappers and its own build process, things will not really be better than if the packaging is at least consistent across the various distributions.
      But ideally, there should be very little packaging code needed anyway, and most things should be done by an appropriate installation process upstream.

      And seriously, /usr/lib/hadoop/lib is a mess. There even is a package in there with a "" in the file name. Plus, a lot of these jars are available in Debian, and could be shared across packages if the packages accepted having them managed by the distribution instead of shipping their own...

      Even within the bigtop packages this leads to a totally unnecessary overlap:

      995720 Sep 25 14:18 /usr/lib/hadoop-hdfs/lib/snappy-java-1.0.3.2.jar
      995720 Sep 25 14:18 /usr/lib/hadoop-mapreduce/lib/snappy-java-1.0.3.2.jar
      995720 Sep 25 14:18 /usr/lib/hadoop-yarn/lib/snappy-java-1.0.3.2.jar
      [...]
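      To make the overlap concrete, here is a small sketch (the function name is mine, and it relies on GNU md5sum/uniq) that lists groups of byte-identical jars under a tree, as candidates for replacement by symlinks:

```shell
# List groups of byte-identical .jar files under a directory tree.
# uniq -w32 groups lines by the 32-character md5 hash prefix;
# --all-repeated=separate prints each duplicate group, blank-separated.
find_dup_jars() {
    find "$1" -name '*.jar' -type f -exec md5sum {} + \
        | sort | uniq -w32 --all-repeated=separate
}
```

For example, `find_dup_jars /usr/lib` would report each snappy-java copy shipped by the hdfs, mapreduce, and yarn packages as one duplicate group.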

      1. BIGTOP-713.anatoli.partial.patch
        18 kB
        Anatoli Fomenko
      2. BIGTOP-713.historyserver.patch
        2 kB
        Anatoli Fomenko
      3. BIGTOP-713.mackrorysd.partial.patch.2
        17 kB
        Sean Mackrory
      4. BIGTOP-713.patch
        7 kB
        James Page
      5. BIGTOP-713.patch.final.txt
        48 kB
        Roman Shaposhnik
      6. BIGTOP-713.rvs.partial.patch.txt
        7 kB
        Roman Shaposhnik

          Activity

          James Page added a comment -

          Re source formats for Debian packages - we discussed this on the -dev mailing list a while back; I even volunteered to take a look, but (#fail) I've not managed to spend any time on it other than teaching the build process to understand the *.debian.tar.gz that source/format 3.0 produces.

          I need to commit some time to looking at this for the next release.

          James Page added a comment -

          Also, I don't think the objective of Bigtop is, or should be, to get packages accepted into Debian mainline. Although, as you state, some of the dependencies are packaged (albeit at different versions), the amount of effort required to fill the gaps should not be underestimated; Hadoop has been in Debian before, but the time commitment was too much for the original maintainer and it was dropped.

          Roman Shaposhnik added a comment -

          First of all, we would love for some of the core Debian developers/maintainers to help us make Bigtop a better citizen there, especially when it comes to managing Java. If you're interested, it would be extremely nice to have a thread on the bigtop-dev mailing list aimed at helping us implement some of these improvements.

          For example, current Hadoop (0.23.3)

          Just to clarify: Hadoop 0.23.3 is not really the current release of Hadoop. If you want stable pre-YARN Hadoop, go with the 1.X code line; if you want YARN and the latest HDFS goodness, go with 2.X. Outside of a few use cases, I don't think there's any reason to use Hadoop 0.23.3 today.

          Modern Debian packaging with "quilt" has an excellent mechanism for managing such patches.

          Bigtop has had a zero-patching policy so far: we never patch upstream components for our own releases. That said, providing such a capability would be useful for folks who use Bigtop as a Hadoop stack management system.

          Furthermore, "java-wrappers" is a Debian and Ubuntu package

          Could you elaborate on what functionality you'd suggest we leverage from there? Also, since we have to support a variety of different distros: is something similar available in the rest of them? BIGTOP-276 aims at solving the thorniest issue of all: classpath management in the presence of conflicting requirements (e.g. Hadoop wanting version X.Y of guava.jar and Zookeeper wanting version A.B, etc.).

          And seriously, /usr/lib/hadoop/lib is a mess. There even is a package in there with a "" in the file name.

          Couldn't agree more. As I said, anything that can help us sort out the classpath hell should be discussed on BIGTOP-276. We definitely shouldn't be shipping identical jars (at the least, symlinks should be used), but I really don't think we can get rid of the requirement of shipping different versions of the same jar to satisfy the requirements of different projects in the Hadoop ecosystem (which is also the reason why we can't simply depend on the jars provided by the distribution).

          Anyway, any kind of help will definitely be appreciated, provided that the changes are applicable to Lucid+ Ubuntu and Lenny+ Debian releases.

          Erich Schubert added a comment -

          Hadoop 0.23.3

          I was under the impression that 0.23.3 is the current Hadoop release. The version numbering of Hadoop is a mess: if you read the changelog for 2.0.0-alpha, the first line identifies it as 0.23.1, and the same goes for 2.0.1-alpha. So I was under the assumption that 0.23.3 - the latest release, two months after 2.0.1-alpha - was actually the newest version, just that nobody rebranded it as 2.0.2-alpha or so. And upstream Subversion uses 3.0.0 everywhere, IIRC.

          patching

          Debian, too, would love to have zero patches. However, if you want to get a bug fixed quickly for your users, it often is best to fix it, make a patch, and send it out to your users for testing and upstream for inclusion. Debian changelogs are full of entries like "remove patches ..., included upstream" (and also patches that were solved differently by upstream). In fact, the compile fix I mentioned - fixed in Hadoop SVN the same way - is a good example of the need for patching. It won't compile otherwise, so with a zero-patching policy this means you cannot build Hadoop Pipes on current Debian, because its GCC is too new.

          conflicting library versions

          Again, this is a problem that does not only affect Hadoop. In my personal opinion, it is a consequence of how dependencies are handled in the Java community: you leave it to your users (and Maven) to get all the jars you need. If people had to use one system to manage the dependencies for all of the Java software they use, they would be more aware of such conflicts. And of course these conflicts also occur with binary libraries. It is common for distributions to take care of this, and they will also try to offer multiple versions of a library when they are incompatible.
          And in some cases, it is easiest to patch (or recompile, for binary packages) some dependent software, in order to only have to provide one version of a library.

          Debian java packaging already manages a symlink farm of the type:
          xml-apis-ext.jar -> xml-apis-ext-1.4.01.jar

          So packages can use "any version of xml-apis-ext.jar", for example. For explicit version dependencies, you would have a versioned dependency on the package, obviously. Most version dependencies are of the ">= x.y" type; a few are of the "< z" type (when e.g. an API changes for a major version).
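          In debian/control terms, such version constraints might look like this (the package names and versions here are hypothetical, not taken from actual Debian packages):

```
Depends: libxml-apis-ext-java (>= 1.4),
         libtrove3-java (<< 4)
```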

          When it is known that a package breaks API compatibility, the distributions should take care to make both versions installable at the same time. For example, GNU Trove 2 and GNU Trove 3 are not API compatible. Debian ships them as "trove.jar -> trove-2.x.y.jar" and "trove-3.jar -> trove-3.0.3.jar" symlinks. So far, the packages depending on Trove 2 or Trove 3 continue to work...

          java-wrappers

          I believe they allow apps to specify e.g. "java6" or "java7", and the wrappers may then choose a different Java runtime than the system default.

          A typical java-wrappers script looks like this:

          #!/bin/sh
          # Load the java-wrappers helper functions.
          . /usr/lib/java-wrappers/java-wrappers.sh
          # Pick a JVM, preferring OpenJDK 6, then Sun Java 6.
          find_java_runtime openjdk6 sun6
          # Build the classpath from the named jars.
          find_jars app batik fop
          # Run the main class on the chosen JVM with that classpath.
          run_java mainclass "$@"

          Here find_jars takes care of setting up the classpath. I haven't looked into the details of how you would specify a versioned requirement; with Trove, you would use trove-3.jar. Furthermore, many jars already reference other jars via the Class-Path attribute in their manifest, which works well with the Debian-installed jars when they are in the system folder. Ideally, jars in Debian are packaged with such dependencies; for example, fop.jar specifies commons-io.jar, xercesImpl.jar, xalan2.jar, etc. The example above could even be simplified: batik is needed by fop, so we could leave it out.
          Debian also ships some projects split into numerous smaller jar files. Batik is a good example: there is batik-all containing all of Batik, but there are also smaller jars containing e.g. only the parser, so that a project trying to reduce memory requirements can load just the part of Batik it needs into the classpath.

          It's probably not perfect - the Debian Java team seems to be a bit underpowered, as so often (they certainly do not currently have the capacity to do Hadoop packages) - but they do seem to be working towards a manageable Java ecosystem. Often such infrastructure things need some users to spread across distributions. I don't know what Red Hat has for managing Java, maybe more, maybe less. The "alternatives" system is a good example of an infrastructure utility adopted across distributions over time. Quoting from an internet page:

          "Fedora's implementation of alternatives is a rewrite and extension of the alternatives system used in Debian."

          So if java-wrappers are useful for Bigtop, it may be very manageable to have them adopted by the Fedora ecosystem as well, while bigtop-utils is not yet adopted by either, I guess.

          Roman Shaposhnik added a comment -

          so with a 0 patching policy this means you cannot build Hadoop Pipes on current Debian

          No disagreement there. In a patch discussion I'd like to separate policy from capability. IOW, I completely agree that a capability to manage patches via the same infrastructure that Bigtop provides would be extremely helpful to projects like Debian. Given Bigtop's cross-distro nature I would like this capability to be cross-distro as well, but perhaps we can map it efficiently onto the Debian/RPM toolsets. The question here, of course, is who would be doing the actual work. And that's where we get back to a policy discussion: as a matter of policy, Bigtop doesn't do patches for OUR binary artifacts (the actual DEB/RPM packages that we publish), hence there's not much incentive for us to invest, but we'd love for this contribution to come from some of our community members (hint, hint).

          conflicting library versions (And in some cases, it is easiest to patch (or recompile, for binary packages) some dependant software, to only have to provide one version of a library)

          Unfortunately, our experience has been that it is incredibly difficult to harmonize the versions of jars across such a gigantic stack as the Hadoop ecosystem has ended up being. It basically comes down to things downright breaking if you try to substitute versions.

          Debian java packaging already manages a symlink farm of the type:

          Wait, are you saying that it is possible to install as many versions of foo.jar on Debian as I want? If so, please elaborate.

          A typical java wrappers script looks like this

          I'm going to take a look. Stay tuned.

          James Page added a comment -

          Conversion of hadoop package to source format 3.0

          James Page added a comment -

          Roman

          See attached patch for conversion of the Hadoop debian packaging to source format 3.0. Broadly:

          1) I bumped the debhelper requirement to >= 7 (to support the use of override_XX targets in debian/rules).
          2) I switched the package to source/format 3.0 (which means a debian.tar.gz gets built instead of a diff).
          3) I refactored debian/rules to use overrides.
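          For reference, a minimal debhelper-7-style rules file using an override might look like the sketch below (the custom build command is illustrative, not from the patch; recipe lines must be indented with a tab, as in any makefile):

```makefile
#!/usr/bin/make -f

# Catch-all rule: hand every target to dh, which runs the
# standard debhelper sequence.
%:
	dh $@

# Override a single step where this package needs custom behaviour;
# the build script name here is illustrative.
override_dh_auto_build:
	./do-component-build
```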

          This did flush out a couple of issues in the packaging.

          Hopefully that will give you an idea of what's required.

          One thing I did notice is that Architecture: all packages are shipping native libraries and binaries (due to the dropping of the hadoop-native package?). This is generally not a good idea. I think this may be causing issues with the automated library dependency generation - but I need to take a closer look.

          I expected hadoop to have dependencies on zlib, snappy etc., but they are missing - this is the same as in the debs published on S3 for the 0.4.0 release, AFAICT.

          I can keep chugging through these but my next few weeks are very busy due to Ubuntu release and planning commitments.

          Roman Shaposhnik added a comment -

          James, thanks a million for the patch - it definitely provides a beautiful example of how to do the transition for the rest of our packages. It would be extremely helpful if you could spare a few cycles to address the issues with Architecture: and deps so that we can have the hadoop DEB as a model for the rest of this conversion.

          That said, for simpler packages (Mahout, Pig, etc.) we already have a very nice blueprint in what you've done here. I'll try to convert a couple shortly, but it would be very nice if other members of the community can help with it as well.

          Roman Shaposhnik added a comment -

          Attaching a patch that converts all of the bigtop-* packages. This leaves us with 12 packages to go.

          Sean Mackrory added a comment -

          I can work on the remaining 12...

          Sean Mackrory added a comment (edited) -

          I haven't thoroughly tested these (just spot-checked a few packages), but this should work for updating zookeeper, hbase, pig, hive, sqoop, and giraph.

          On my first attempt I ran into some difficulties with hue, oozie, flume, whirr, mahout and datafu. I'll keep working on those unless somebody else posts patches before I get to it...

          6 more to go!

          Anatoli Fomenko added a comment -

          I'll look into the remaining 6.

          Sean Mackrory added a comment -

          Just updated my patch - I moved the new `format` files to the source/ subdirectory.

          Anatoli Fomenko added a comment -

          I added a patch for datafu, flume, hue, mahout, oozie and whirr.

          Roman Shaposhnik added a comment -

          +1 on all the patches I've seen so far. I've collected them together and made available on the github: https://github.com/rvs/bigtop/tree/master (https://github.com/rvs/bigtop/commit/4d5f49eaa4f6103ecbbc58434ef5d3486947fff0) I have also built the entire Bigtop on lucid/precise and the binaries are now available from here: http://bigtop01.cloudera.org:8080/job/Bigtop-BIGTOP-713/label=lucid/lastSuccessfulBuild/artifact/output/bigtop.list and here http://bigtop01.cloudera.org:8080/job/Bigtop-BIGTOP-713/label=precise/lastSuccessfulBuild/artifact/output/bigtop.list

          It would be much appreciated if folks could go over this patch one more time and also try installing the binaries using the list files I mentioned above.

          I really would like to commit this some time next week, since it makes our Deb code way cleaner.

          Erich Schubert added a comment -

          The patch looks good (I could build 2.0.2 packages for Debian) and clearly is an improvement. As such, I recommend committing it.

          Apart from the .jar mess, which still persists, I also cannot get the mapred history daemon running. That is probably not Debian-specific, though. First of all, $HADOOP_MAPRED_LOG_DIR is not defined in the etc/defaults file, so it defaults to /usr/lib/something, which is not writable. Furthermore, mr-jobhistory-daemon.sh does not seem to expect a --config parameter (despite its usage message saying so); instead it fails because --config is neither "start" nor "stop".
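          One possible workaround sketch for the log-dir half of this (the defaults file name and target path are my assumptions, not from the thread) would be to define the variable explicitly:

```shell
# Hypothetical addition to the package's /etc/default file for the
# history server: point its logs at a writable directory instead of
# letting them fall back to somewhere under /usr/lib.
export HADOOP_MAPRED_LOG_DIR=/var/log/hadoop-mapreduce
```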

          Roman Shaposhnik added a comment -

          The current patch has been committed. I'm still keeping this one open to address the feedback from Erich and also to give James a chance to take a look at Architecture: and deps.

          Anatoli Fomenko added a comment -

          Added patch for MAPREDUCE-4814 workaround for historyserver.

          Roman Shaposhnik added a comment -

          Thanks a million, Anatoli! +1 and committed!

          Anatoli Fomenko added a comment -

          As MAPREDUCE-4814 has been resolved upstream, the workaround for this issue in the historyserver needs to be removed in 0.6.


            People

            • Assignee: Anatoli Fomenko
            • Reporter: Erich Schubert