Bigtop / BIGTOP-1893

Compilation of hadoop-yarn-client failed

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0, 1.1.0
    • Fix Version/s: 1.0.0, 1.1.0
    • Component/s: build
    • Labels:
      None
    • Environment:

      bigtop commit eb3ebb535abee15fc37b4c333ce865686fccaa87 (current master)

      Description

      When I tried to build rpm packages of the Bigtop distribution on current master via gradle:

      ./gradlew rpm
      

      Compilation of component hadoop-yarn-project fails:

      [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile (default-testCompile) on project hadoop-yarn-client: Compilation failure
      [ERROR] /opt/bigtop/build/hadoop/rpm/BUILD/hadoop-2.6.0-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java:[70,30] error: package org.jboss.netty.logging does not exist
      
      The referenced line in TestYarnCLI.java:
      import org.jboss.netty.logging.CommonsLoggerFactory;
      
      1. dont-mess-with-maven-cache.1.diff
        0.8 kB
        Olaf Flebbe
      2. BIGTOP-1893.1.patch
        2 kB
        Olaf Flebbe
      3. BIGTOP-1893.2.patch
        2 kB
        Olaf Flebbe

        Activity

        oflebbe Olaf Flebbe added a comment -

        You are right with your explanation of my comments. I am still thinking about the right way to do it (tm)

        evans_ye Evans Ye added a comment -

        Committed to master and 1.0 branch.
        Thank you all for the investigation and the patch!

        evans_ye Evans Ye added a comment -

        Thanks for the explanation and review. +1 I'll commit this in a few minutes.

        jonathak Jonathan Kelly added a comment -

        Correct, that's the fix for now in Bigtop. The patch will no longer be needed once Bigtop is upgraded to Hadoop 2.7.0, which includes YARN-2301 (which fixes this issue indirectly due to an import cleanup occurring in that file as part of some other changes).

        I think Olaf's comment regarding addressing this in a separate JIRA probably means the larger task of no longer installing artifacts into the local Maven repository. This makes sense to me, though I do have some slight concerns because my team currently relies on this (since we override the built versions of some packages in order to include some patches that we have not yet contributed back). This is the entire reason we added the ability to specify a Git repo for a package rather than always downloading a source tarball for a released version. Maybe we can make installing built packages to the local Maven cache configurable (and disabled by default)?

        +1 from me on the patch too.
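
        The "configurable, disabled by default" idea above could look roughly like the sketch below. It is only an illustration: BIGTOP_MVN_INSTALL and maven_goal are invented names, not existing Bigtop options, and the real do-component-build scripts may install artifacts differently.

```shell
#!/usr/bin/env bash
# Hypothetical switch (assumption): BIGTOP_MVN_INSTALL is an invented variable name,
# not an existing Bigtop setting. It selects the Maven goal used for a component build.
maven_goal() {
  if [ "${BIGTOP_MVN_INSTALL:-false}" = "true" ]; then
    # "install" also copies the built artifacts into the local Maven repository.
    echo install
  else
    # "package" builds the artifacts without installing them into the local repository.
    echo package
  fi
}
```

        A build script could then call something like mvn -DskipTests "$(maven_goal)", so that teams relying on locally installed artifacts opt in explicitly.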

        evans_ye Evans Ye added a comment -

        I've tested this and it works. Just to clarify: this fix removes the import of netty's CommonsLoggerFactory, since it isn't used in the code anyway. Is my understanding correct?
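
        For readers who want to see the effect of such a fix, here is a minimal sketch that drops the unused import from a Java source file. It is not the content of BIGTOP-1893.2.patch; the function name strip_unused_netty_import is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (assumption): illustrates the idea of the workaround,
# it is NOT the actual patch attached to this issue.
# Deletes the unused netty import line from the given Java source file.
strip_unused_netty_import() {
  local java_file="$1"
  local tmp
  tmp="$(mktemp)" || return 1
  # Filter into a temp file, then copy back so the original file keeps its permissions.
  sed '/^import org\.jboss\.netty\.logging\.CommonsLoggerFactory;$/d' "$java_file" > "$tmp" \
    && cat "$tmp" > "$java_file"
  local status=$?
  rm -f "$tmp"
  return "$status"
}
```

        It would be called with the path to TestYarnCLI.java inside the unpacked Hadoop source tree before the Maven build starts.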

        oflebbe Olaf Flebbe added a comment -

        Fixed Typo, tested on centos7 and debian

        oflebbe Olaf Flebbe added a comment -

        TYPO in patch

        oflebbe Olaf Flebbe added a comment -

        This is a quick workaround. The real problem will be handled in a separate JIRA.

        oflebbe Olaf Flebbe added a comment -

        Will propose a workaround. First I will rename the JIRA, since the diagnosis is not 100% correct.

        evans_ye Evans Ye added a comment -

        I must admit that I don't have much knowledge of this, but AFAIK we do not have a case where one component needs to be built before another. So, unless there are other reasons, installing jars into the maven cache seems odd to me.
        Maybe experts such as Konstantin Boudnik or Roman Shaposhnik can give us more context?

        oflebbe Olaf Flebbe added a comment -

        You are right, it is the zookeeper .. but not in a way you have imagined before, I suspect.

        oflebbe Olaf Flebbe added a comment - - edited

        Oh holy mess ... I didn't like this part of bigtop before and now it strikes right into the build..

        bigtop installs some artifacts into the maven cache. It appears to me that somehow our zookeeper build is different from the one stored at maven central ....

        I would propose to remove all messing with the maven cache from bigtop. There are a few other lines like that in do-component-builds.

        Evans Ye what do you think?

        oflebbe Olaf Flebbe added a comment - - edited

        This triggers the bug... Why it does not trigger in docker is now clear.

        See attached file ...

        jonathak Jonathan Kelly added a comment -

        That's good to know that it's reproducible on a fresh debian instance. I'd be interested to know if reverting the upgrade from Zookeeper 3.4.5 to 3.4.6 makes the issue go away. I don't have any specific reason to suspect the Zookeeper upgrade though, other than for the fact that my team had not hit this issue at all until we started using the latest from bigtop master, which included the Zookeeper upgrade, as I mentioned in a comment above.

        oflebbe Olaf Flebbe added a comment -

        At least I was able to reproduce it on a fresh debian with a clean ~/.m2

        Looking into it

        jonathak Jonathan Kelly added a comment -

        Oh, wow, I hadn't noticed that that import isn't even used. =P

        mbukatov Martin Bukatovic added a comment -

        Yes, I checked the hadoop git repository and checked output of git log -p TestYarnCLI.java.

        It's quite unfortunate that this bug breaks the build process of a stable Hadoop release. A quick patch that removes that single line would solve this, but if we follow the Bigtop tenet of not touching upstream code, we don't have many easy options. And since I have no detailed knowledge of the build process, I have no further suggestions here. Moreover, since the jenkins build process is not affected, I guess Bigtop may not want to do any weird hacks just because of this.

        Anyway, we should probably report back to the Hadoop people that their build process is not perfect. I always thought that this kind of bug is stupid and easy to catch ...

        jonathak Jonathan Kelly added a comment -

        Yep, it still doesn't make sense to me either, and I also am unsure of exactly what causes it to break again after a while.

        jonathak Jonathan Kelly added a comment -

        Some more context regarding what's happening on my team:

        My team currently has several of our own Bigtop commits based on a couple-months-old commit of Bigtop back when it was going to be called 0.9, and I'm trying to rebase those commits onto Bigtop 1.0, but I was running into this issue with hadoop-yarn-client failing to build. What's weird is that our team's 0.9 build is totally stable, even though it's using the same version of Hadoop 2.6.0. My only thought about why I might be hitting this issue with my 1.0-based branch rather than our stable 0.9-based branch is that one difference between my 0.9 and 1.0 branches is the version of Zookeeper--it's 3.4.5 in our 0.9-based branch and 3.4.6 in 1.0. The only reason I think that it's possible that this has something to do with it is because of what I said above about how this netty dependency is coming in transitively from zookeeper's test-jar, whereas hadoop-yarn-client should probably depend directly on netty if it's going to use this CommonsLoggerFactory class.

        jonathak Jonathan Kelly added a comment -

        Awesome, thanks for finding this! I kept trying to find a JIRA or some discussion about this issue but was unable to find anything, so until you cut this JIRA and let me know that I wasn't the only one hitting this, I was beginning to think I was crazy. These JIRAs you've found don't even address the issue directly, so I'm curious to know how you even found them. I'm guessing you must have looked to see what other commits modified TestYarnCLI.java?

        So this is great that we know what caused and fixed the issue, but I'm still not totally sure what we can do to fix this in Bigtop. It's unfortunate that the fix for YARN-2301 doesn't address this issue directly because it means that it's not just a really simple pom.xml fix we can apply to the local Hadoop 2.6.0 source before we build it in do-component-build like we do in several other package builds. The YARN-2301 patch also doesn't apply cleanly to Hadoop 2.6.0, so we can't necessarily use the same method that was used to backport the ZOOKEEPER-1911 patch to Bigtop, unless we create a new YARN-2301 patch that applies cleanly onto Hadoop 2.6.0.

        What are your thoughts?

        evans_ye Evans Ye added a comment -

        Sorry, I don't have any clue on this, but our nightly CI build for hadoop seems steady. Probably because we're building packages inside a Docker container, so there is no ~/.m2 at the very beginning.
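
        To approximate the "no ~/.m2 at the beginning" condition outside a container, a single Maven invocation can be pointed at an empty local repository. The sketch below assumes the build is started with mvn directly; whether ./gradlew rpm forwards such a property to the component builds is not established in this thread. maven.repo.local is a standard Maven property; the wrapper name with_fresh_maven_repo is made up.

```shell
#!/usr/bin/env bash
# Sketch (assumption: the build is invoked via mvn directly, not through ./gradlew).
# Runs the given command with an empty, throwaway local Maven repository,
# so ~/.m2/repository is neither read nor written by that invocation.
with_fresh_maven_repo() {
  local repo
  repo="$(mktemp -d)" || return 1
  # Append the standard maven.repo.local property to the caller's command line.
  "$@" "-Dmaven.repo.local=$repo"
}
```

        For example, with_fresh_maven_repo mvn -DskipTests package would resolve every dependency from remote repositories instead of from artifacts installed by an earlier component build.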

        mbukatov Martin Bukatovic added a comment -

        Thanks Jonathan Kelly, your hack worked for me! When I removed the ~/.m2/ directory and then reran ./gradlew rpm, the component compiled fine and the build went on.

        It worked because I now have org.jboss.netty.logging in my local maven repo:

        [bigtop@bigtopdev ~]$ find ~/.m2/repository -name 'netty*jar'
        /home/bigtop/.m2/repository/io/netty/netty/3.6.2.Final/netty-3.6.2.Final.jar
        /home/bigtop/.m2/repository/org/jboss/netty/netty/3.2.4.Final/netty-3.2.4.Final.jar
        [bigtop@bigtopdev ~]$ find ~/.m2/repository -name 'netty*jar' | xargs -n 1 jar tvf  | grep CommonsLoggerFactory
           832 Wed Jan 16 12:42:26 EST 2013 org/jboss/netty/logging/CommonsLoggerFactory.class
           832 Mon Feb 07 21:38:44 EST 2011 org/jboss/netty/logging/CommonsLoggerFactory.class
        

        That said, it doesn't make any sense to me.

        mbukatov Martin Bukatovic added a comment - - edited

        Another look in the git logs and release notes of hadoop releases 2.6.0 and 2.7.0 shows that:

        • YARN-2698 which creates this bug is part of 2.6.0 release
        • YARN-2301 which fixes it is part of 2.7.0 release

        So either there is something seriously wrong with the Hadoop build process (so that nobody noticed before the release that the project can't be built) or I'm missing something. Moreover, Jonathan Kelly's note about fixing the issue by removing the ~/.m2/ directory doesn't inspire much confidence in the build process.

        Or maybe I don't understand it at all. I'm not an expert when it comes to building large java projects.

        mbukatov Martin Bukatovic added a comment - - edited

        Oh, so this seems to be a Hadoop bug that was introduced in YARN-2698 and has already been fixed in YARN-2301.

        jonathak Jonathan Kelly added a comment -

        I just tried blowing away my local Maven repo again, and it "fixed" the issue again. I have no idea what's going on.

        However, I still think that hadoop-yarn-client might need an explicit test dependency on netty, but I have not yet tested this. Should we add a hack to Hadoop's do-component-build script in order to unblock building Hadoop in Bigtop and in parallel cut a JIRA to fix it in the Hadoop project?

        jonathak Jonathan Kelly added a comment -

        Argh, I just hit this again on a new build node, so maybe blowing away my ~/.m2/repository directory didn't actually fix anything. I have no idea how I got it to work temporarily.

        jonathak Jonathan Kelly added a comment -

        Weird, I saw the same thing a couple days ago, but I could not figure out what could possibly have caused it. However, I blew away my ~/.m2/repository directory, and the build succeeded.

        While I was investigating this issue, I found that the netty test dependency comes transitively through hadoop-yarn-client => zookeeper (test-jar) => netty, so maybe hadoop-yarn-client should depend directly on netty. This would be a Hadoop issue rather than a Bigtop issue though, right?
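
        One way to check where netty enters the test classpath is Maven's dependency:tree goal, filtered to the netty group IDs. This is a sketch under assumptions: the module path is taken from the error message in this issue, both group IDs (org.jboss.netty and io.netty) are included because jars from both show up in the local repository listings in this thread, and the function name netty_dependency_tree is made up.

```shell
#!/usr/bin/env bash
# Sketch: print the dependency paths that pull netty into hadoop-yarn-client.
# Assumptions: $1 is the root of an unpacked Hadoop source tree and mvn is on PATH.
netty_dependency_tree() {
  local src_root="${1:-.}"
  (
    # Module path as it appears in the compiler error reported in this issue.
    cd "$src_root/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client" || exit 1
    # -Dincludes limits the tree to artifacts whose groupId matches one of the patterns.
    mvn dependency:tree -Dincludes=org.jboss.netty,io.netty
  )
}
```

        If netty only shows up beneath the zookeeper test-jar, the transitive path described above is confirmed for that particular set of resolved artifacts.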


          People

          • Assignee:
            oflebbe Olaf Flebbe
            Reporter:
            mbukatov Martin Bukatovic