Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      This issue is for discussing pros and cons of moving hbase build to Apache Maven.

      Maven, if you take on its paradigm, does a lot for you. There are also a bunch of nice plugins that produce useful reports on the state of the project: findbugs, and that nice plugin where you can give out URLs that resolve to lines in source code (a doxygen-like thing ... I've forgotten its name). Another example is a docbook plugin that would run the doc build inline with the main build. We could start up the hbase book in docbook format and the hbase book would ride along with versions.

      As I see it – and it's a while since I've done this stuff so things may have changed since – what stands in the way of an easy move to Maven is our src/contrib content. Maven would have these as distinct projects pulling in their hbase dependency or, if you wanted to take on the Maven subproject notion, hbase would be at the same level in the build as the contribs – it would be a subproject too, just built before the others.

      Anyone interested in working on this issue?

      1. findbugs.html
        27 kB
        Paul Smith
      2. findbugs.html
        284 kB
        Paul Smith
      3. HBASE-2099.13.patch
        27 kB
        Paul Smith
      4. HBASE-2099.14.patch
        27 kB
        Paul Smith
      5. HBase Move Script.txt
        5 kB
        Paul Smith
      6. mvn.out
        86 kB
        stack
      7. test-reports.zip
        62 kB
        Paul Smith

        Issue Links

          Activity

          stack added a comment -

          I'm going to close this issue though we still have work to do; hudson passed for the first time this evening.

          @Steve Thanks for the input. I like the one where we'd clear the local cache when building a release, just in case. Thanks for the warnings regarding sloppy POMs – makes sense.

          @Lars and @Paul, thanks for wrestling this one to this stage. Excellent work.

          Lars Francke added a comment -

          I really don't want to start a discussion about Maven vs. Ant. I just like the Maven style better. It is true that there is a lot of incorrect metadata, and we're in the process of whittling the dependencies down to those that are really needed. If you use Ant the process is bottom-up: you start with nothing and have to figure out what you need. I'd say it depends on what you prefer.

          1. I don't think that's necessary or even a good idea. As you said: Artifacts aren't allowed to change (at least in the central repository) so there is really no need to clean the local repository; it just makes the builds take longer.

          2. Can't say a lot about that except that the Wagon and Cargo plugins make it pretty easy to deploy and run stuff on other machines. But I'm still not sure I can follow you - I haven't yet done any very complex deployment operations with Maven (just deployment to a remote Tomcat etc. with Cargo), but if it can be done with Ant it can be done with Maven. Either way: Maven now seems to do everything we need for HBase, but Ant and Ivy worked too.

          As to the clean up: The unit tests currently build their stuff in a "build" or "data" directory. When we abort a build for whatever reason those might not be cleaned up, and they won't be picked up by the clean plugin. This is something I'll have to remember to look at.

          steve_l added a comment -

          No, the metadata is often getting worse. Old stuff stays the same, but if you look at things like commons-logging 1.1.1, its dependency graph now pulls in servlets 2.3 while v1.0.4 didn't.
          http://mvnrepository.com/artifact/commons-logging/commons-logging/1.1.1
          Whatever tool you use, treat the dependency data as a hint rather than something to blindly depend on. Better to explicitly pull in everything you need. And when you release, audit your metadata before you ship, as you aren't allowed to change it.
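
          To make the advice concrete, here is a minimal sketch of what "explicitly pull in everything you need" looks like in a POM: declare the dependency yourself, and exclude a transitive artifact you don't want (the commons-logging/servlet-api pairing below is taken from the example above; adjust coordinates to your own case):

          ```xml
          <!-- Sketch: depend on commons-logging explicitly, but keep its
               transitive servlet-api out of our classpath. -->
          <dependency>
            <groupId>commons-logging</groupId>
            <artifactId>commons-logging</artifactId>
            <version>1.1.1</version>
            <exclusions>
              <exclusion>
                <groupId>javax.servlet</groupId>
                <artifactId>servlet-api</artifactId>
              </exclusion>
            </exclusions>
          </dependency>
          ```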

          1. ~/.m2/repository and ~/.maven/repository. For Ivy I'd go for ~/.ivy and ~/.ivy2. You can do a full rm -rf or just purge your artifact tree. Either way, good to do for clean builds, especially on the release VM.
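
          A sketch of the two purge options (using a scratch directory here so the commands are safe to run anywhere; substitute ~/.m2/repository or ~/.maven/repository for a real clean build):

          ```shell
          # Simulate a local Maven cache in a scratch dir, then purge just one
          # project's artifact tree - the lighter alternative to a full rm -rf.
          REPO="$(mktemp -d)/repository"
          mkdir -p "${REPO}/org/apache/hbase/hbase-core/0.20.2-SNAPSHOT"
          rm -rf "${REPO}/org/apache/hbase"   # purge just this artifact tree...
          # rm -rf "${REPO}"                  # ...or the full scorched-earth wipe
          test ! -d "${REPO}/org/apache/hbase" && echo purged
          ```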

          2. The issue is just that when your build process involves creating RPMs and scp-ing them to VMs that you ask the cloud infrastructure for as part of the test run you are on your own, whatever tooling you have to hand.

          3. Yes. Like I said, I'm not taking sides, just emphasising risk.

          Lars Francke added a comment -

          Just as a heads up: HBase trunk has already moved to Maven as of a few days ago. This issue can be closed. We have and had a few follow up issues that deal with the problems that occurred: HBASE-2267, HBASE-2264, HBASE-2254, HBASE-2250

          The blog post you cite is very old, and while some of the points still apply, a lot of it has been fixed or gotten better since. Bad metadata isn't a problem exclusive to Ivy and Maven though; it affects Ant too. Even more so, as a lot of projects don't clearly specify their dependencies at all. We can just do our best to be good "citizens" and provide good project descriptions - in this case Maven POMs.

          1. The build currently doesn't work under Hudson as a result of the switch from Ivy to Maven, but I expect this'll be sorted out in the coming days/weeks. What local cache directories do you mean need cleaning up? I don't think we currently do any cleaning up after the build, so any specifics would be helpful so we can change that.
          2. I'm not quite sure I can follow you there, but there are Maven plugins that help in the test process in various ways. What exactly are you missing there for HBase? If everything else fails, Maven can run Ant tasks fairly easily. I believe we do this as part of our build process now (JspC).
          3. Well... that's obviously true independent of the build system used (Ivy, Maven, Ant, Makefiles, ...)

          tar: Maven handles this the same way: http://maven.apache.org/plugins/maven-assembly-plugin/assembly-mojo.html#tarLongFileMode - in the newest patch I chose the "gnu" mode.
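
          For reference, the assembly-plugin setting in question would look something like this in the pom (a sketch, not the exact patch contents):

          ```xml
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-assembly-plugin</artifactId>
            <configuration>
              <!-- "gnu" permits tar entries with paths longer than 100 characters -->
              <tarLongFileMode>gnu</tarLongFileMode>
            </configuration>
          </plugin>
          ```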

          steve_l added a comment -

          As part of the Ant team you can probably expect me to be -1 on Maven, but I take the view that "whatever suits your needs is best"

          One thing you really, really have to look out for in Maven is getting the POMs right for downstream use. Bad metadata is a problem in both Ivy and M2 builds:
          http://www.1060.org/blogxter/entry?publicid=9EE3794599F42C4E1D9BF2F2FE655180

          The problem is that to get your build to work you just add on as many JARs as you need, but downstream things are different. Audit the POMs; they are now release artifacts of their own!

          1. Make sure that builds under Hudson work with the latest version of everything being picked up. Don't be afraid to keep some build.xml files around to clean up all the local cache directories, which are effectively shared state across projects.
          2. Make sure that you can get functional testing to work. Maven is fairly biased towards "classic" webapp dev, not more complex workflows of requesting real/virtual machines from the infrastructure, then deploying and testing on them. Ant isn't biased towards either, leaving you to do everything yourself.
          3. Don't make the build something that only one person can understand/maintain. They will become a SPOF.

          On the topic of tar, the Ant docs explain the issue, the main one being that filenames longer than 100 characters are handled differently by different tools. You see, we in the Ant team do care about such problems, which are most likely to occur on non-Linux Unix platforms:
          http://ant.apache.org/manual/CoreTasks/tar.html

          Paul Smith added a comment -

          If you are happy the resultant layout works fine in your own IDE for you and others, then yeah, v14 should be fine.

          stack added a comment -

          @Paul NM on the above. The warnings are a little disturbing but we can deal w/ them going forward. I'll commit v14 of the patch?

          Paul Smith added a comment -

          In response to Dan's comments:

          any particular reason you specify versions of some plugins (e.g. maven-source-plugin=2.1.1)?

          ummm.... slackness! The proper (ok, 'anal') Maven way is for every single plugin defined in a pom to have an explicit version. This is so builds are completely reproducible.
          By not specifying a version, the most recent version is used during the build, so it is possible that right at the moment of producing a full ASF distribution a plugin is updated and perhaps you get hit by a bug in that plugin.

          So I would recommend we explicitly put in a version for all plugins; I think I have a TODO in there already, but it was always my intent, even if it's just locking in the current set of plugin versions. The ones that currently have versions specified are probably a consequence of the fact that their example docs have easily copy/pastable snippets containing version info... :-$
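
          For example, locking the plugin from Dan's question to an explicit version is just (a sketch; 2.1.1 is the version mentioned above):

          ```xml
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-source-plugin</artifactId>
            <!-- pinned so the build is reproducible -->
            <version>2.1.1</version>
          </plugin>
          ```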

          maybe define the maven-compiler-plugin tweaks in the pluginManagement section of the pom. That way sub modules can take advantage...

          I must look up the usage of the pluginManagement section; it's not something I'm intimate with. From experience, the sub-modules already inherit from the top-level pom's build plugin definition (it certainly seems to work like that for our own corporate pom).

          Will investigate.
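
          As a sketch of what Dan is suggesting (the compiler settings here are assumptions for illustration, not taken from the patch): declaring the plugin under pluginManagement in the parent pom lets sub-modules inherit its configuration without re-declaring it:

          ```xml
          <build>
            <pluginManagement>
              <plugins>
                <plugin>
                  <groupId>org.apache.maven.plugins</groupId>
                  <artifactId>maven-compiler-plugin</artifactId>
                  <configuration>
                    <source>1.6</source>  <!-- assumed Java level, illustrative -->
                    <target>1.6</target>
                  </configuration>
                </plugin>
              </plugins>
            </pluginManagement>
          </build>
          ```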

          there is a mention of the "Aconex Snapshots" under distributionManagement...?

          Copy/paste #fail. ahem. thanks...

          could some of the dependencies mentioned in the core pom be optional (e.g. hadoop-mapred, thrift). So third parties depending on hbase core don't have to explicitly exclude stuff...

          yes, that was definitely on my list to investigate, but I don't think I have a TODO in there yet. I initially targeted just matching the Ivy dependencies (as much as I could, since I'm not familiar with the Ivy syntax). I would definitely suggest we review the 'mvn dependency:tree' output and go through each jar and say "oh, no, we don't neeeeed that one, let's mark that as optional".

          I definitely want to do that before the next HBase release if it's done by Maven; we totally didn't do this when log4j switched to Maven during log4j 1.2.15 and a gazillion projects have been paying the price ever since. (I am definitely partly to blame there, for sure.)
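
          A minimal sketch of marking one of the dependencies mentioned above as optional in the core pom (the coordinates and version shown are illustrative, not taken from the patch):

          ```xml
          <dependency>
            <groupId>org.apache.thrift</groupId>    <!-- illustrative coordinates -->
            <artifactId>thrift</artifactId>
            <version>0.2.0</version>
            <!-- optional: downstream consumers of hbase-core won't inherit this -->
            <optional>true</optional>
          </dependency>
          ```

          Running `mvn dependency:tree` afterwards shows what downstream consumers would actually pull in.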

          Paul Smith added a comment -

          [WARNING] NOTE: Currently, inclusion of module dependencies may produce unpredictable results if a version conflict occurs.

          Ok, I had read up about that exact warning earlier in the week. It is related to the fact that to get a full distribution - that is, the full tar ball - you have to invoke the assembly plugin's 'assembly:assembly' goal explicitly.

          This is because I have configured this assembly in the top-level pom. Apparently this is 'not the preferred way'. Instead, a new sub-module should be created to host this. Basically shift the definition into a, say, 'dist' sub-module.

          I'd like to experiment with this after it's in trunk, when I can use git more effectively to publish potential changes for review (if Ryan is ok with me using his github repo; otherwise I could create my own, I guess). I think my patch contains 'just enough' to get started, but not enough for an official release (there are a few other reasons it's not ready yet anyway).

          [INFO] contrib/transactional/hbase-contrib-transactional-0.20.2-SNAPSHOT.jar already added, skipping

          To be honest, no idea. Seems benign though; maybe an errant INFO-level log message (perhaps it should be DEBUG). Can follow up with the maven-user list if it's annoying.

          [WARNING] Entry: hbase-0.20.2-SNAPSHOT/core/src/test/java/org/apache/hadoop/hbase/regionserver/TestDeleteCompare.java longer than 100 characters.

          The 'official' tar spec does not support filenames > 100 chars. In practice, the tar ball works fine. I've seen this plenty of times during our own Aconex builds and not a single server has yet failed to unpack it. It's perhaps targeted at warning for really old or esoteric Unix distributions.

          May even be able to turn it off via config, not sure yet.

          Oh, and how do I build the package? I can't. I have to go back to my mvn checkout, is that right?

          I have no idea what you mean here... sorry.

          Dan Washusen added a comment -

          Can't wait!

          A few questions/comments:

          • any particular reason you specify versions of some plugins (e.g. maven-source-plugin=2.1.1)?
          • maybe define the maven-compiler-plugin tweaks in the pluginManagement section of the pom. That way sub modules can take advantage...
          • there is a mention of the "Aconex Snapshots" under distributionManagement...?
          • could some of the dependencies mentioned in the core pom be optional (e.g. hadoop-mapred, thrift). So third parties depending on hbase core don't have to explicitly exclude stuff...
          stack added a comment -

          Paul, so, now jars are showing up in the tarball. Good stuff. The Talisker must have done the job.

          What about these warnings....

          [WARNING] NOTE: Currently, inclusion of module dependencies may produce unpredictable results if a version conflict occurs.
          [WARNING] NOTE: Currently, inclusion of module dependencies may produce unpredictable results if a version conflict occurs.
          [WARNING] NOTE: Currently, inclusion of module dependencies may produce unpredictable results if a version conflict occurs.
          

          ... and

          [INFO] contrib/transactional/hbase-contrib-transactional-0.20.2-SNAPSHOT.jar already added, skipping
          [INFO] contrib/transactional/hbase-contrib-transactional-0.20.2-SNAPSHOT.jar already added, skipping
          [INFO] contrib/transactional/hbase-contrib-transactional-0.20.2-SNAPSHOT.jar already added, skipping
          [INFO] contrib/transactional/hbase-contrib-transactional-0.20.2-SNAPSHOT.jar already added, skipping
          [INFO] contrib/transactional/hbase-contrib-transactional-0.20.2-SNAPSHOT.jar already added, skipping
          [INFO] contrib/transactional/hbase-contrib-transactional-0.20.2-SNAPSHOT.jar already added, skipping
          [INFO] contrib/transactional/hbase-contrib-transactional-0.20.2-SNAPSHOT.jar already added, skipping
          [INFO] contrib/transactional/hbase-contrib-transactional-0.20.2-SNAPSHOT.jar already added, 
          

          ... and

          [WARNING] Entry: hbase-0.20.2-SNAPSHOT/core/src/test/java/org/apache/hadoop/hbase/regionserver/TestDeleteCompare.java longer than 100 characters.
          [WARNING] Entry: hbase-0.20.2-SNAPSHOT/core/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java longer than 100 characters.
          [WARNING] Entry: hbase-0.20.2-SNAPSHOT/core/src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java longer than 100 characters.
          [WARNING] Entry: hbase-0.20.2-SNAPSHOT/core/src/test/java/org/apache/hadoop/hbase/regionserver/TestGetDeleteTracker.java longer than 100 characters.
          [WARNING] Entry: hbase-0.20.2-SNAPSHOT/core/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeyValueScanFixture.java longer than 100 characters.
          [WARNING] Entry: hbase-0.20.2-SNAPSHOT/core/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeyValueSkipListSet.java longer than 100 characters.
          [WARNING] Entry: hbase-0.20.2-SNAPSHOT/core/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinorCompactingStoreScanner.java longer than 100 characters.
          

          Otherwise, it runs nice and prompt.

          Oh, and how do I build the package? I can't. I have to go back to my mvn checkout, is that right?

          Paul Smith added a comment -

          Latest patch. The only difference between v13 and this one is the src/assembly/bin.xml file, which now contains the correct groupIds that we all agreed on.

          Full Disclosure: 2 nips of Talisker (Isle of Skye) were utilised in the development of this patch.

          Paul Smith added a comment -

          FACE PALM

          oh crap. I forgot the appassembler descriptor change when we changed from org.apache.hadoop.hbase -> org.apache.hbase.

          oh so trivial change, new patch coming.

          Man, I'm so sorry, I feel like a complete tool.

          stack added a comment -

          Paul, any chance of you taking a look at this mvn.out file?

          It's doing a bunch of the right things, fast, that's great... but see the WARNINGs at the end about file length. E.g.:

          [WARNING] Entry: hbase-0.20.2-SNAPSHOT/core/src/test/java/org/apache/hadoop/hbase/regionserver/TestWildcardColumnTracker.java longer than 100 characters.
          [WARNING] Entry: hbase-0.20.2-SNAPSHOT/core/src/test/java/org/apache/hadoop/hbase/regionserver/transactional/DisabledTestHLogRecovery.java longer than 100 characters.
          [WARNING] Entry: hbase-0.20.2-SNAPSHOT/core/src/test/java/org/apache/hadoop/hbase/regionserver/transactional/DisabledTestTransactionalHLogManager.java longer than 100 characters.

          This is macosx.
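For what it's worth, these warnings typically come from the tar archiver: classic ustar tar headers cap entry paths at 100 characters. A hedged sketch of one way to silence them — `tarLongFileMode` is real maven-assembly-plugin configuration, but exactly which pom it belongs in here is an assumption:

```xml
<!-- In the <build><plugins> section of whichever pom runs the assembly:
     switch the tar writer to GNU long-name extensions so entries longer
     than 100 characters are stored intact instead of warned about. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <tarLongFileMode>gnu</tarLongFileMode>
  </configuration>
</plugin>
```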

          And it seems like the resultant tar.gz is missing stuff when all is done, no jars make it into the bundle:

          pynchon-2:x stack$ ls -la
          total 128
          drwxr-xr-x   9 stack  staff    306 Feb 18 22:46 .
          drwxr-xr-x  10 stack  staff    340 Feb 18 22:46 ..
          -rw-r--r--   1 stack  staff  11358 Jan 21 16:01 LICENSE.txt
          -rw-r--r--   1 stack  staff    293 Jan 21 16:01 NOTICE.txt
          -rw-r--r--   1 stack  staff     43 Jan 21 16:01 README.txt
          drwxr-xr-x  19 stack  staff    646 Feb 18 13:51 bin
          drwxr-xr-x   8 stack  staff    272 Feb 18 21:59 conf
          drwxr-xr-x   4 stack  staff    136 Feb 18 22:46 contrib
          -rw-r--r--   1 stack  staff  43090 Feb 18 22:38 hbase-0.20.2-SNAPSHOT-bin.tar.gz
          pynchon-2:x stack$ find . -name *jar
          pynchon-2:x stack$ 
          

          BTW, your instructions are amazing Paul. Even a gorilla like myself is able to follow the bouncing ball.

          Paul Smith added a comment -

Oh, and all the groupIds have been changed to org.apache.hbase, as requested a little while ago.

          Paul Smith added a comment -

Latest patch and plan.

          • Added developers section in the top-level pom, taken from the credits.html page from the current HBase site. This will be output during a 'mvn site' (copy for review to be shown in another post, may have to be tomorrow)
• Relocated the webapps directory into core/src/resources - it looks like this directory is needed inside the jar, as some unit tests wouldn't pass without access to these directories. Realigned the jspc compilation so that it picks up the files in this directory now too.
• Added a specific exclude when running unit tests, because there's a class in there that isn't an actual JUnit test (it's SoftValueSortedMapTest.java, if you're intrigued). It would be nicer to either remove this class or move it into core/src/main/java as a 'test harness' rather than a test; that would keep the configuration simpler.
          • Plan document now includes specifically marking 'target' as an excluded directory under svn:ignore.
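The surefire exclude mentioned in the second point might look roughly like this — a sketch only; the actual patch may phrase it differently:

```xml
<!-- Skip the class that is named like a JUnit test but isn't one. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <excludes>
      <exclude>**/SoftValueSortedMapTest.java</exclude>
    </excludes>
  </configuration>
</plugin>
```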
          Karthik K added a comment -

When publishing artifacts, it would be nice to publish a client-side-only artifact to link against, as specified by HBASE-2170, to minimize the volume of incoming dependencies that might be unnecessary for a Put/Scan scenario.
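A hypothetical sketch of what consuming such a client-only artifact could look like downstream — note that the hbase-client artifactId does not exist yet (see HBASE-2170), and the version shown is illustrative:

```xml
<!-- Hypothetical: a Put/Scan-only consumer would depend on a slim
     client artifact instead of the full core jar. -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>0.20.2-SNAPSHOT</version>
</dependency>
```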

          Karthik K added a comment -

Group Id finalized: org.apache.hbase (see INFRA-2461 for more details).

Paul, can you help incorporate this as well?

          Paul Smith added a comment -

          Yes. As far as zookeeper is concerned - it will be available soon in a maven snapshot repository with the next release. As far as thrift is concerned - we can publish ourselves once a groupId is identified for hbase.

The Maven Release plugin is absolutely, brutally anal about snapshot dependencies, though. It will not let you 'mvn release' a project such as HBase if that project depends on snapshot artifacts. There are good reasons for it, but there's no flexibility built in. I'm hoping that Zookeeper can be deployed with non-snapshot status (which sort of by definition means Maven Central..)

I'm sure that since Zookeeper is pushing its own releases, it won't take too much effort to get Zookeeper 3.2.2 into Maven Central.

          Thrift is still the sticking point, the fact it's an incubation project makes life difficult. That doesn't mean you can't deploy a non-snapshot version of the artifact to a known repository (even Stack's own one), you just won't be able to take advantage of the Nexus staging and deployment into central that Sonatype is offering.

          Karthik K added a comment -

I've relocated the SCM definition I had in hbase-core

Great. I believe the important distinction is to have connection and developerConnection nodes available under scm, since those would be used by the maven-release plugin.
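Concretely, the nodes in question look like this — the repository paths shown are illustrative and would need to match the actual HBase svn layout:

```xml
<scm>
  <!-- read-only checkout URL -->
  <connection>scm:svn:http://svn.apache.org/repos/asf/hadoop/hbase/trunk</connection>
  <!-- committer (read-write) checkout URL, used by the release plugin -->
  <developerConnection>scm:svn:https://svn.apache.org/repos/asf/hadoop/hbase/trunk</developerConnection>
  <!-- browsable URL -->
  <url>http://svn.apache.org/viewvc/hadoop/hbase/trunk</url>
</scm>
```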

Unless Zookeeper and Thrift are put somewhere standard soon, you won't pass this one.

          Yes. As far as zookeeper is concerned - it will be available soon in a maven snapshot repository with the next release. As far as thrift is concerned - we can publish ourselves once a groupId is identified for hbase.

It's unclear what the use of the googlecode, java.net and codehaus repositories means for this as well. Seems odd to eliminate us based on that..? Could someone in a more sensible timezone chat with some of the Sonatype guys or their IRC channel about this? There must be a bucket load of Apache projects that wouldn't qualify there.

I do not think this is a major showstopper, because those original instructions that I had pasted are for artifacts uploaded to the central Maven repository; when it comes to the Apache snapshot Maven repository (as applies for hbase), the criteria are more relaxed. So that should not be a showstopper, I guess. I will confirm this with the Sonatype people as well.

          Paul Smith added a comment -

Paul - can you also help double-check the pom.xml files against this - http://nexus.sonatype.org/oss-repository-hosting.html#4 - to verify that all necessary elements are present?

          Ok, I've gone and added the missing ones which were: Licenses, developers (I just added stack in as an example, I'll need more details, or you can fill it in as a template later), URL, and Description. I've relocated the SCM definition I had in hbase-core to be in the top-level pom, I think that makes more sense there.

          but here's the kicker:

          Make sure your POM does not contain repositories or pluginRepositories element. Central repository must be self-contained, so all your dependencies must be available in Central. You need to make sure your project can be built without extra repositories or pluginRepositories.

Unless Zookeeper and Thrift are put somewhere standard soon, you won't pass this one. It's unclear what the use of the googlecode, java.net and codehaus repositories means for this as well. Seems odd to eliminate us based on that..? Could someone in a more sensible timezone chat with some of the Sonatype guys or their IRC channel about this? There must be a bucket load of Apache projects that wouldn't qualify there.
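For the record, the elements Sonatype asks for are plain pom metadata. A sketch with illustrative values — the real URL, description, and developer entries would need to match the actual project site and committer list:

```xml
<!-- Top-level pom metadata required for Central syncing. -->
<url>http://hadoop.apache.org/hbase/</url>
<description>HBase: the Hadoop database</description>
<licenses>
  <license>
    <name>The Apache Software License, Version 2.0</name>
    <url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
  </license>
</licenses>
<developers>
  <developer>
    <id>stack</id>
    <name>Michael Stack</name>
  </developer>
</developers>
```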

          Paul Smith added a comment -

But - when we publish an artifact, we can always add sources and javadocs as separate artifacts (for the same version), right? So the binary would just have .class/.jar files, as a library for other apps to link against as needed?

OK, I see where we may be on different wavelengths here - my apologies. The short answer is that Maven will absolutely, definitely have -sources and -javadoc artifacts produced naturally and pushed to a Maven repo as part of the 'deploy' phase. That's done automatically, without any configuration.

However, that common automatic case generally covers pure artifact-pushing to, say, Maven Central (or, more formally, a spot which is automatically ingested into Central). Most projects like to have a large tar ball with a collection of artifacts put together for the end user (who is perhaps either looking to learn HBase, or doesn't use Maven/Ivy). This latter case is what the 'tar ball' I've produced via the Maven Assembly plugin does. The former case is done automatically during the 'deploy' phase.

By default, the -sources and -javadoc jars are not generated during the 'package' phase, but it is easily configured to do so if that's what we wish; in fact, I've already configured the -sources.jar to always be produced during 'package', because I need it as part of the assembly in the end so that it can appear in the tar ball.

In the Maven world, the binary, the sources, and the javadoc are all separate artifacts sharing the same groupId:artifactId:version coordinates; the latter two just get given a 'classifier' (with the main jar given a 'blank' classifier, implying the main artifact).
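So, assuming hbase-core coordinates like those discussed in this thread (the version shown is illustrative), a downstream pom could pull the sources artifact for IDE use by adding a classifier — a sketch only:

```xml
<!-- Main binary artifact: blank classifier. -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-core</artifactId>
  <version>0.20.2-SNAPSHOT</version>
</dependency>
<!-- Same coordinates, 'sources' classifier: resolves the -sources.jar. -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-core</artifactId>
  <version>0.20.2-SNAPSHOT</version>
  <classifier>sources</classifier>
</dependency>
```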

          All of this becomes second nature after a while with Maven, and I forget sometimes! sorry!

          Karthik K added a comment -

Paul - can you also help double-check the pom.xml files against this - http://nexus.sonatype.org/oss-repository-hosting.html#4 - to verify that all necessary elements are present?

For those that do not make sense, maybe you can add a TODO and bring them up as appropriate. This would be useful for publishing artifacts once this is in.

          Karthik K added a comment -
We may get a bit philosophical here, but my own personal view is that a binary download of a release of a project shouldn't need all the things needed to build the product - that's what SVN/Git is for. The sources and javadocs are there as reference material for the user of the binary. Even for a standalone user, in the majority of cases they're there for IDEs to attach as source material for the artifact in the user's project. I can easily change this to have the sources unpacked - in fact the default is to have them unpacked - and maybe it's clearer and more generally 'usable' for the new user to see the sources unpacked, just waiting for perusal.

But - when we publish an artifact, we can always add sources and javadocs as separate artifacts (for the same version), right? So the binary would just have .class/.jar files, as a library for other apps to link against as needed?

          Paul Smith added a comment -

Back to some things stack said in another post:

I like the way you have sources bundled up in a jar, but I think it needs a bit more work. When I unpack it, the sources are not buildable (fellas like being able to do this). There is no build.xml. Test source should probably be there too.

We may get a bit philosophical here, but my own personal view is that a binary download of a release of a project shouldn't need all the things needed to build the product - that's what SVN/Git is for. The sources and javadocs are there as reference material for the user of the binary. Even for a standalone user, in the majority of cases they're there for IDEs to attach as source material for the artifact in the user's project. I can easily change this to have the sources unpacked - in fact the default is to have them unpacked - and maybe it's clearer and more generally 'usable' for the new user to see the sources unpacked, just waiting for perusal.

I would propose we produce two tar balls, as is mostly standard, and the Maven Assembly plugin has a definition right out of the box for a sources-like tar ball:

          http://maven.apache.org/plugins/maven-assembly-plugin/descriptor-refs.html#src

(Look at both the 'sources' descriptor and the one below it, 'project'; stack may be thinking of 'project'. I'll add both so you can see.)
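Wiring those built-in descriptors in is a one-liner each. A sketch of how the assembly plugin would reference them — the built-in names on that page are 'src' and 'project', and the exact placement of this plugin config in our poms is an assumption:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <descriptorRefs>
      <!-- built-in descriptors shipped with the plugin -->
      <descriptorRef>src</descriptorRef>
      <descriptorRef>project</descriptorRef>
    </descriptorRefs>
  </configuration>
</plugin>
```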

          Paul Smith added a comment -

I'd 'vote' in favour of o.a.hbase as a groupId. If there are plans for HBase to become a top-level project (and that makes sense), it probably should have its own identity, independent of Hadoop.

You can then still choose an artifactId of hbase; I think that's fine for now. Really it's about an artifact's identity. At some point, though, you will probably (at least I hope you do) consider refactoring out independent layers of HBase. I think an 'hbase-client' jar, for example, is a great idea for those apps that are just connecting scanner/update clients. Choosing 'hbase' as an artifactId now doesn't preclude that down the track; I just think it would be better to do it now, because later on, if you want to do this right, you (we) should consider a backwards-compatibility layer.

          For an example of how to 'migrate' groupId/artifactId definitions, see http://maven.apache.org/pom.html#Relocation.
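The relocation mechanism linked above amounts to a stub pom published at the old coordinates that forwards consumers to the new groupId — a sketch only, with illustrative coordinates and version:

```xml
<!-- Published at the OLD coordinates; resolvers follow it to the new ones. -->
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hbase</artifactId>
  <version>0.20.2</version>
  <distributionManagement>
    <relocation>
      <groupId>org.apache.hbase</groupId>
    </relocation>
  </distributionManagement>
</project>
```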

          But it's probably nice to get it right first time. We over at log4j had a bit of a newbie-maven-fail moment in log4j 1.2.15 with the jms/jmx dependencies, because they were not declared as 'optional' upstream, so people have been constantly adding the exclusions (hbase itself is a victim here.. )
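For reference, the downstream workaround for that log4j 1.2.15 incident is the kind of exclusion block consumers (hbase included) have had to carry. The excluded coordinates are the well-known offenders from that release; treat this as a sketch of the pattern rather than our exact config:

```xml
<dependency>
  <groupId>log4j</groupId>
  <artifactId>log4j</artifactId>
  <version>1.2.15</version>
  <exclusions>
    <!-- transitive deps that should have been marked optional upstream -->
    <exclusion>
      <groupId>javax.jms</groupId>
      <artifactId>jms</artifactId>
    </exclusion>
    <exclusion>
      <groupId>com.sun.jmx</groupId>
      <artifactId>jmxri</artifactId>
    </exclusion>
    <exclusion>
      <groupId>com.sun.jdmk</groupId>
      <artifactId>jmxtools</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```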

          Karthik K added a comment -
... do you think the above is better than my suggestion of a groupId of org.apache.hadoop and an artifactId of hbase?
Related: it's a good ways off, I think, but there is talk of a few of the Hadoop projects moving up to be top-level Apache projects. HBase is probably a good candidate, so our default package would become o.a.hbase at some stage in the future. This would seem to argue in favor of your suggestion of a groupId of o.a.h.h, Kay Kay?

I guess the way it works is: a project owns a groupId and can publish any number of artifacts under it as it sees fit (and versions under the same).

Since o.a.h is essentially owned by the Hadoop project, and I see HBase as an independent project (as you had mentioned yourself), I thought it might be useful for us to reserve a groupId unique to HBase.

Say, o.a.h.hbase or o.a.hbase - I wanted to make the distinction between the groupId chosen by HBase and the Hadoop projects (mapreduce/hdfs/common etc.), all of which use o.a.h.

This will give us the flexibility to publish artifacts under it as we see fit, without disturbing the groupId owned by other teams.

As far as the JIRA being closed - I believe it was closed saying o.a.hadoop already exists (as requested initially). For the sake of the project's independence, it might make sense to reopen it with a different groupId: o.a.h.hbase, or better, o.a.hbase.

And then we can fit that groupId into this pom.xml.

          Karthik K added a comment -

I generated the ant/ivy-generated binary with the latest trunk for comparison; oddly, it currently doesn't include sources or test jars. Maybe that's a fallout of the ivy transition?

          Yes. I believe so. Ivy tarball needs a bit of attention.

Paul / stack - yes, the 'tar' target had an issue after the ivy transition, about how we want to pack the lib directory in the final distribution. HBASE-2128 tracks it. I believe we did not have a consensus then about how to pack independent jars of core and contrib into the lib directory.

The final patch that I submitted puts everything together under the ./lib directory for a release.

          stack added a comment -

@Kay Kay, regarding the below comment over in the infrastructure issue – since closed, and I'm not sure what that means now that it's been closed – on groupId and artifactId....

          "Stack -
          Would it be useful to have the groupId as - org.apache.hadoop.hbase and then publish artifacts as hbase-core / hbase-client etc. Let me know your thoughts on the same."

... do you think the above is better than my suggestion of a groupId of org.apache.hadoop and an artifactId of hbase?

I suppose we'll want to publish more than hbase.jar up to Maven. There will be hbase-stargate.war and hbase-transaction.jar etc. hbase.jar should become hbase-core then, as you suggest (and as Paul had it originally up in the pom above, before I asked him to change it).

What are the pros/cons you see to using a groupId of o.a.h.h rather than o.a.h?

Related: it's a good ways off, I think, but there is talk of a few of the Hadoop projects moving up to be top-level Apache projects. HBase is probably a good candidate, so our default package would become o.a.hbase at some stage in the future. This would seem to argue in favor of your suggestion of a groupId of o.a.h.h, Kay Kay?

          stack added a comment -

          I like the way you have sources bundled up as a jar but I think it needs a bit more work. When I undo it, sources are not buildable (fellas like being able to do this). There is no build.xml. Test source should probably be there too.

          Contribs look good but need doc carried over?

          In general, I wouldn't sweat the tar ball too much. My guess is that its layout may change some soon enough. We can file issue to address after maven goes in.

          I was able to start up the hbase instance. That's sweet.

          bq. I generated the ant/ivy generated binary with the latest trunk for comparison, oddly it currently doesn't include sources, or test jars. Maybe that's a fallout of the ivy transition?

          Yes. I believe so. Ivy tarball needs a bit of attention.

          I wouldn't worry about the overpopulated lib dir. We can fix that later.

          Paul Smith added a comment -

          latest patch and updated Plan document:

          • sources for hbase-core now in properly, now minus the '-core' part (although see previous comment about a probable better way to achieve this)
          • adds transactional and ec2 contrib areas to the binary.
          • stargate changed over to a WAR. Looks like this module is trying to be 2 things (both a jar and war file). Would likely prefer to make this a multi-module project so as to produce both the jar and the war; it's just much simpler that way (having one module be both is a bit odd otherwise).

          stack (or whomever) this patch is worth a kick of the tires, but if it doesn't work, the generated tar ball this is creating on my local workspace can be gotten here:

          http://people.apache.org/~psmith/hbase/hbase-0.20.2-SNAPSHOT-bin.tar.gz

          Note, in the new plan, there's a set of steps that attempt to set the svn:ignore. You'll need to add the 'target' directory into svn:ignore one by one, if someone knows a smart & fast way to do this, let me know.
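          A possible non-interactive way to append 'target' to svn:ignore (a sketch, assuming an svn working copy; directory names would need adjusting per module):

```shell
# Round-trip the existing property through a file, append, and set it back.
svn propget svn:ignore . > /tmp/ignore.txt
echo "target" >> /tmp/ignore.txt
svn propset svn:ignore -F /tmp/ignore.txt .
```

The same three lines could be wrapped in a for-loop over the module directories to avoid doing it one by one.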

          Paul Smith added a comment -

          Just replying to something stack said way back in response to the generated binary:

          We need to get test jar in there too.

          No problem, just odd; it's not normally something that is bundled in a binary artifact (I don't think I've ever seen that).

          The resultant jar is named hbase-core

          I've got the sources bundled as a jar inside the tar ball, and fixed the 'hbase-core' - 'hbase' jar naming convention. I do wonder if, in the longer term, rather than changing the 'finalName' property in the pom to bypass the artifactId, it would be better to rename the artifactId to just 'hbase' (so the groupId:artifactId would be org.apache.hadoop.hbase:hbase). That might make things overall simpler?
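          The two approaches amount to this in core/pom.xml (a sketch; version is illustrative):

```xml
<!-- Approach 1: keep artifactId hbase-core, override only the built file name -->
<build>
  <finalName>hbase-${project.version}</finalName>
</build>

<!-- Approach 2: rename the artifact itself, so no finalName override is needed -->
<groupId>org.apache.hadoop.hbase</groupId>
<artifactId>hbase</artifactId>
<version>0.20.2-SNAPSHOT</version>
```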

          I generated the ant/ivy generated binary with the latest trunk for comparison, oddly it currently doesn't include sources, or test jars. Maybe that's a fallout of the ivy transition?

          The lib dir is a bit overpopulated methinks but would have to check

          Yeah, compared with the one generated by ivy, it's a bit.. thicker. Is zookeeper all we need in the binary? Easy to do, just odd that with the declared dependencies the transitive ones are coming in too. You guys should know from experience what exactly is needed to run this, so I'm up for suggestions. I'll try to mimic the Ivy-generated format for now, since that should be what we're tracking against; it's just that if I look at my local hbase 0.20.1 binary install I grabbed weeks ago for mucking around with, there are 24 jars and 2 directories in the lib directory, and that's way more than what ivy is spitting out right now?

          I'm going to package up the rest of the contrib projects next; that should be straightforward.

          stack added a comment -

          Kick me when you want me to try a new patch of yours Paul

          Paul Smith added a comment -

          Latest patch and move script based on latest trunk, just wanted to re-sync since I've been out of action with some family health issues (all good now).

          • adjusted to take the new libthrift-0.2.0.jar, that's been deployed to my 'paul special' repository
          • Found missing step to actually 'svn add' the newly patched in poms
          • added svn:ignore entries for Maven's target directory (can't find a simple way to add a line via 'svn propset' non-interactively, probably could do this with a 'svn proplist' via awk, and then add a line, but honestly it's just easier if someone types it in via vi or something.. Lazy! )

          the assembly tar ball is 'working' although I can't quite get the sources to sit properly as a jar, they keep unpacking themselves inside the tar ball, may need an email to the maven-user list, it's not something I've done before, definitely solvable, just not obvious right this second. I'm sure someone will not like the tar ball directory layout, but we can talk.
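          If the sources jar is being pulled into the tar ball via a dependencySet in the assembly descriptor, the unpacking is controlled by the <unpack> flag; a hedged sketch of the relevant fragment (paths and the include pattern are illustrative and may need adjusting to the plugin version):

```xml
<!-- maven-assembly-plugin descriptor fragment: keep the sources artifact as a jar -->
<dependencySet>
  <outputDirectory>/src</outputDirectory>
  <unpack>false</unpack>
  <includes>
    <include>org.apache.hadoop.hbase:hbase-core</include>
  </includes>
</dependencySet>
```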

          Still have the other contrib modules to add, and there's that one test case that fails for some reason, but inching along..

          Karthik K added a comment -

          Important: THRIFT-363 patch contains the version ( in ivy.xml ) to be 0.3.0-yyyyMMdd etc.

          That might need to be changed to 0.2.0 as well.

          Karthik K added a comment -
          You want us to patch thrift 0.2.0 with thrift-363? Then you want me to set up a mvn repo up on people.apache.org/~stack.

          Sure. If you have the source code of thrift-0.2.0 - you can apply the patch in THRIFT-363 ( the latest one ) and see if you are able to publish the artifacts.

          Caveat:

          + <property name="apache.snapshot.repository" value="https://repository.apache.org/content/repositories/snapshots" />

          This property needs to be overridden / changed to a URL that can publish to the home directory (people.apache.org/~username).

          Aliter:
          --------
          This patch should install the package in ~/.m2/repository on the local disk, and the installed files can be manually copied (ftp-ed) to the path mentioned above.
          (We can go down this route if figuring out the URL to publish to people.apache.org/~username is harder.)

          stack added a comment -

          @Kay Kay You want us to patch thrift 0.2.0 with thrift-363? Then you want me to set up a mvn repo up on people.apache.org/~stack.

          Karthik K added a comment -
          I'm good with that. Could host it in my account since closer into hbase.

          Great going - Paul and stack.

          For thrift - THRIFT-363 should have the patch to publish artifacts .

          @stack - can you help give it a try to publish repository to your account. We need to apply the patch in thrift and then

          $ cd lib/java
          $ ant publish

          to publish the artifacts.

          stack added a comment -

          That was it for me.

          [INFO] Building tar : /Users/stack/checkouts/hbase/trunk/target/hbase-0.20.2-SNAPSHOT-bin.tar.gz
          [INFO] 
          [INFO] 
          [INFO] ------------------------------------------------------------------------
          [INFO] Reactor Summary:
          [INFO] ------------------------------------------------------------------------
          [INFO] HBase ................................................. SUCCESS [1:38.680s]
          [INFO] HBase Core ............................................ SUCCESS [1:03.140s]
          [INFO] HBase Contrib ......................................... SUCCESS [0.005s]
          [INFO] HBase Contrib - Stargate .............................. SUCCESS [37.954s]
          [INFO] ------------------------------------------------------------------------
          [INFO] ------------------------------------------------------------------------
          [INFO] BUILD SUCCESSFUL
          [INFO] ------------------------------------------------------------------------
          [INFO] Total time: 3 minutes 23 seconds
          [INFO] Finished at: Thu Jan 21 12:12:17 PST 2010
          [INFO] Final Memory: 55M/153M
          [INFO] ------------------------------------------------------------------------
          

          The resulting tar ball is pretty dang close. The lib dir is a bit overpopulated methinks but would have to check. The resultant jar is named hbase-core:

          -rwxr-xr-x 1 stack staff 1583331 Jan 21 12:10 hbase-core-0.20.2-SNAPSHOT.jar

          We need to get test jar in there too.

          Hadoop-style is to ship the source all in the one tar ball but maven does bin and src tarballs. That's fine by me, separating them.

          I ran with tests and all output shows on console. We can fix that later?
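          (Not part of the patch, but for later reference: surefire can redirect per-test output to files instead of the console, along these lines.)

```xml
<!-- core/pom.xml build/plugins fragment: send test stdout/stderr to
     target/surefire-reports/*-output.txt instead of the console -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <redirectTestOutputToFile>true</redirectTestOutputToFile>
  </configuration>
</plugin>
```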

          Stargate seems to be the only contrib. Is that intentional? That's fine if it is.

          Whats under the contrib dir needs work but can do that later.

          I tried to build the site and got this:

          Missing:
          ----------
          1) org.apache.hadoop.hbase:hbase-core:jar:0.20.2-SNAPSHOT
          
            Try downloading the file manually from the project website.
          
            Then, install it using the command: 
                mvn install:install-file -DgroupId=org.apache.hadoop.hbase -DartifactId=hbase-core -Dversion=0.20.2-SNAPSHOT -Dpackaging=jar -Dfile=/path/to/file
          
            Alternatively, if you host your own repository you can deploy the file there: 
                mvn deploy:deploy-file -DgroupId=org.apache.hadoop.hbase -DartifactId=hbase-core -Dversion=0.20.2-SNAPSHOT -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]
          
            Path to dependency: 
                  1) org.apache.hadoop.hbase:hbase-contrib-stargate:jar:0.20.2-SNAPSHOT
                  2) org.apache.hadoop.hbase:hbase-core:jar:0.20.2-SNAPSHOT
          
          ----------
          1 required artifact is missing.
          

          This is looking great Paul.

          Paul Smith added a comment -

          From a private email from stack which contained the poms, just removing the '<scope>test</scope>' from the commons-lang dependency fixed this problem (at least for me).
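          For reference, the fix amounts to declaring commons-lang with default (compile) scope in core/pom.xml (the version shown is illustrative):

```xml
<!-- No <scope>test</scope> here: the thrift-generated main sources
     need HashCodeBuilder from commons-lang at compile time -->
<dependency>
  <groupId>commons-lang</groupId>
  <artifactId>commons-lang</artifactId>
  <version>2.4</version>
</dependency>
```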

          Paul Smith added a comment -

          ok, well my MAVEN_OPTS is < 1024, but is greater than default, I was using -Xmx512 because our own corporate packaging requires it.

          I'll note that down in the plan though. I'll be curious to know where it barfs through lack of memory, because the HBase build system isn't too large.

          Build is failing on me because I just updated us to thrift 0.2.0. I'm putting commons-lang in wrong place. I added version property up in the top-level pom and then added the commons-lang dependency to core/pom.xml but that doesn't seem to be enough for the compile of the thrift generated classes. They fail to compile looking for HashCodeBuilder from commons-lang. What would you suggest?

          Well, that should work. Can you send me the top-level pom and core/pom.xml you are using? Where would I get thrift 0.2.0 from? Is this stuff in trunk already, or is it in a private work area?

          I'll start drafting a note to hbase-dev anyway.

          stack added a comment -

          nm. pilot error (I'd double applied the patch so xml files were doubled).

          So, I got further. Had to set MAVEN_OPTS="-Xmx1024m" else it was OOMEing.

          Build is failing on me because I just updated us to thrift 0.2.0. I'm putting commons-lang in wrong place. I added version property up in the top-level pom and then added the commons-lang dependency to core/pom.xml but that doesn't seem to be enough for the compile of the thrift generated classes. They fail to compile looking for HashCodeBuilder from commons-lang. What would you suggest?

          Paul, I think that I could make this work with your sweet instructions, no problem. Whats missing is the vote on the move to maven. One thing I was thinking was that we could bring up "move to maven" as a topic at next weeks HUG. One of us could talk it up for you but for sure you should start the discussion going up on the list.

          Great stuff Paul.

          stack added a comment -

          Instructions on how to move to maven are high quality. Thanks Paul.

          I get the following after all is said and done:

          Here is my mvn version:

          pynchon:trunk stack$ ~/bin/mvn/bin/mvn clean package assembly:assembly
          [INFO] Scanning for projects...
          [INFO] ------------------------------------------------------------------------
          [ERROR] FATAL ERROR
          [INFO] ------------------------------------------------------------------------
          [INFO] 8479
          [INFO] ------------------------------------------------------------------------
          [INFO] Trace
          java.lang.ArrayIndexOutOfBoundsException: 8479
                  at hidden.org.codehaus.plexus.util.xml.pull.MXParser.parsePI(MXParser.java:2470)
                  at hidden.org.codehaus.plexus.util.xml.pull.MXParser.parseEpilog(MXParser.java:1575)
                  at hidden.org.codehaus.plexus.util.xml.pull.MXParser.nextImpl(MXParser.java:1405)
                  at hidden.org.codehaus.plexus.util.xml.pull.MXParser.next(MXParser.java:1105)
                  at org.apache.maven.model.io.xpp3.MavenXpp3Reader.parseModel(MavenXpp3Reader.java:2133)
                  at org.apache.maven.model.io.xpp3.MavenXpp3Reader.read(MavenXpp3Reader.java:3912)
                  at org.apache.maven.project.DefaultMavenProjectBuilder.readModel(DefaultMavenProjectBuilder.java:1606)
                  at org.apache.maven.project.DefaultMavenProjectBuilder.readModel(DefaultMavenProjectBuilder.java:1571)
                  at org.apache.maven.project.DefaultMavenProjectBuilder.buildFromSourceFileInternal(DefaultMavenProjectBuilder.java:506)
                  at org.apache.maven.project.DefaultMavenProjectBuilder.build(DefaultMavenProjectBuilder.java:200)
                  at org.apache.maven.DefaultMaven.getProject(DefaultMaven.java:604)
                  at org.apache.maven.DefaultMaven.collectProjects(DefaultMaven.java:487)
                  at org.apache.maven.DefaultMaven.collectProjects(DefaultMaven.java:560)
                  at org.apache.maven.DefaultMaven.getProjects(DefaultMaven.java:391)
                  at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:272)
                  at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:138)
                  at org.apache.maven.cli.MavenCli.main(MavenCli.java:362)
                  at org.apache.maven.cli.compat.CompatibleMain.main(CompatibleMain.java:60)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                  at java.lang.reflect.Method.invoke(Method.java:597)
                  at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
                  at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
                  at org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
                  at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
          [INFO] ------------------------------------------------------------------------
          [INFO] Total time: < 1 second
          [INFO] Finished at: Wed Jan 20 15:57:42 PST 2010
          [INFO] Final Memory: 3M/79M
          [INFO] ------------------------------------------------------------------------
          
          pynchon:trunk stack$ ~/bin/mvn/bin/mvn -version
          Apache Maven 2.2.1 (r801777; 2009-08-06 12:16:01-0700)
          Java version: 1.6.0_15
          Java home: /System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/Home
          Default locale: en_US, platform encoding: MacRoman
          OS name: "mac os x" version: "10.6" arch: "x86_64" Family: "mac"
          
          Paul Smith added a comment -

          Incidentally, the HBASE-2099-7.patch and plan attached in the previous attempt WAS intended for use by the ASF (or thrown out and stomped on, whatever gets people excited). I had a brain fail there in choosing a tick box, oh my.

          Paul Smith added a comment -

          Attached is the first actual patch I deem worthy for someone to try out, and if successful, I will then mail the hbase-dev list to elicit further feedback.

          Please:

          1. Save the HBASE-2099-7.patch somewhere.
          2. Read the HBase Move Script.txt file; it contains some important info at the top if you are an existing Maven user that has a local Repository Manager configured. If you don't, I think it's (strangely) simpler, but otherwise straightforward.
          3. Execute the non-commented-out patch, svn, and mvn statements on your console in your clean hbase-trunk checkout.

          crosses fingers

          Paul Smith added a comment -

          @stack Melbourne

          stack added a comment -

          @Paul Where in Oz?

          Paul Smith added a comment -

@KayKay, all but 2 pass (see the attachment to this issue named 'test-reports'; I haven't looked at them in detail). See my comment dated 15th Jan starting 'Here's a small zip of the HTML report ...'; it shows the summary.

I'll loop back over these failing tests soon. (There's still the 'error' test, which is a test without any tests, a main-method-only 'test-harness' that either has to be excluded, deleted, or moved out of test; simple either way.)

I'm just testing the repository definitions in maven by removing my local reference to our Nexus repo. Oh man, I forgot how slow it is to download everything from US->Australia (where I am).

Our office internet is a bit flaky at the moment too, bummer.

          Karthik K added a comment -

@Paul very good progress. Just to make sure - did you get mvn test to work as well?

          stack added a comment -

I'm good with that. Could host it in my account since it's closer to hbase.

          Paul Smith added a comment -

          I don't think Maven can pull from a local directory, BUT, what would be 'wrong' with just, say, 'mvn deploy:deploy-file' these 2 to, say, people.apache.org/~psmith/..
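
For illustration, that deploy-file idea might look like the following; the groupId/artifactId/version and the target URL are placeholders, not agreed coordinates:

```shell
# Sketch only -- coordinates and the target path are placeholders.
mvn deploy:deploy-file \
  -Dfile=lib/libthrift-0.2.0.jar \
  -DgroupId=org.apache.thrift \
  -DartifactId=libthrift \
  -Dversion=0.2.0 \
  -Dpackaging=jar \
  -Durl=scp://people.apache.org/home/psmith/public_html/hbase/repo/
```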

          The jars need to come from a repo, there's no requirement that it be a 'flashy' one..

          Now, specifically, why CAN'T the zookeeper final be uploaded to Maven central? That's a 'released' product from an Apache point of view? (or are you using a Snapshot one, and if so, then Apache Public Snapshots is fine too).

          thrift is in incubation, I know an incubation project can't publish official releases, but I don't think that means they can't publish a snapshot. I could look into that if you like, but otherwise just hosting these @ people.apache.org.. is an interim measure.

          Once I have the repositories defined in the pom (lunch time today), I'll write the list, I'd like to get to the point where I would think that anyone could follow my steps to a clean trunk, and it should just build.

          Thanks for everyone's feedback!

          stack added a comment -

@Paul Excellent. You think it time to write the list now? I think your message to the list should include a pointer to a mavenized hbase site hosted in your personal apache dir as per above. That'll help people understand what's involved.

          Also, regards the jars that are not yet up in maven – as per Lars – thrift and zk, can we tell maven to use the ones checked into the lib dir? (You used to be able to IIRC).

          Lars Francke added a comment -

          Very good work. I'm looking forward to building HBase with Maven.

          One thing though: You've added a dependency for Thrift but that will perhaps never come to a Maven repository.

          Could you have a look at HBASE-1360?
We basically need three new dependencies for the core for now (at least for 0.21): slf4j-simple 1.5.8, slf4j-api 1.5.8 (I've seen that you added that already for avro) and commons-lang 2.4. And the libthrift-0.2.0.jar file from the lib/ directory, which you mention should be installed to the local repositories (mvn install:install-file/Nexus), but it would be ideal if Maven could use the jar from the lib directory directly for now without this additional step.
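
(For reference, one way Maven can use a jar straight from the lib/ directory is a system-scope dependency; a sketch, with the groupId/artifactId assumed rather than official Thrift coordinates:)

```xml
<!-- Sketch: system scope points Maven at the checked-in jar directly.
     Coordinates are illustrative; note that system-scope dependencies are
     not pulled into assemblies or resolved transitively. -->
<dependency>
  <groupId>org.apache.thrift</groupId>
  <artifactId>libthrift</artifactId>
  <version>0.2.0</version>
  <scope>system</scope>
  <systemPath>${basedir}/lib/libthrift-0.2.0.jar</systemPath>
</dependency>
```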

          Paul Smith added a comment -

          This patch now makes decent headway into creating a overarching Hbase project-level binary (that's the src/assembly/bin.xml descriptor).

          Attached:

1. latest patch, now actually USABLE with 'patch -p0' etc. Not sure why svn diff doesn't output new files correctly, but there you go.
2. 'plan' script showing the steps to execute to get to this point, which is a poor-man's version of Git I'm sure.
3. example top-level binary distribution (not including everything yet, but looking close to the current setup)

          Basically:

          1. Take a clean HBase-trunk checkout
          2. execute the steps included in the plan (but change the path to the patch)

note: I'm positive this will not compile unless you have a local Nexus Maven repository setup, because I haven't specified the snapshot & google-code repos in the top-level pom as yet; without these specified, and without a local repository manager proxying them, the dependencies won't be found.

          That's next on my list.

          However, if I:

          mvn -Dmaven.test.skip.exec=true package assembly:assembly
          

          I find a tar.gz file in the hbase-trunk/target directory, which if unpacked, and then:

          target/hbase-0.20.2-SNAPSHOT-bin/bin/start-hbase.sh
          

seems to start HBase (at least no NoClassDefFoundErrors anyway; I think I have an existing HBase service running that it doesn't like).
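
The src/assembly/bin.xml descriptor referred to above is roughly of this shape; a minimal sketch, where the fileSets/dependencySets are illustrative rather than the actual descriptor contents:

```xml
<assembly>
  <id>bin</id>
  <formats>
    <format>tar.gz</format>
  </formats>
  <fileSets>
    <!-- ship the launch scripts with execute permission -->
    <fileSet>
      <directory>bin</directory>
      <outputDirectory>bin</outputDirectory>
      <fileMode>0755</fileMode>
    </fileSet>
  </fileSets>
  <dependencySets>
    <!-- bundle the runtime dependency jars under lib/ -->
    <dependencySet>
      <outputDirectory>lib</outputDirectory>
    </dependencySet>
  </dependencySets>
</assembly>
```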

          Paul Smith added a comment -

          Just sync'ing (sort of) with what Kay Kay is doing with ivy build, attached current patch containing findbugs reporting plugin, and example report for core and contrib/stargate.

          Paul Smith added a comment -

attached latest 'slim' patch with poms only (I can work out the desired directory structure from this, and the full-blown patch is just a bit heavy to continue doing for now). Just saving work.

          Paul Smith added a comment -

          Here's a small zip of the HTML report from the test run after getting most of this to work.

          This highlights a couple of things on the current test source:

• org.apache.hadoop.hbase.util.SoftValueSortedMapTest: this doesn't have any methods to run, so currently comes up as an error under Maven. I can't see in the build.xml where this would be marked as one to ignore; I can add it in to be ignored, but it's simpler to either move this class out of the test source tree (or even delete it if it's not useful anymore)
• 2 failures; not sure if they currently happen in trunk or not. Perhaps someone more familiar with HBase can tell me:
          testIsDeleted_NotDeleted
          	junit.framework.AssertionFailedError: expected:<false> but was:<true>
          	
          org.apache.hadoop.hbase.regionserver.TestGetDeleteTracker:260
          

          and

          testMakeZKProps
          	junit.framework.AssertionFailedError: expected:<2000> but was:<3000>
          	
          org.apache.hadoop.hbase.zookeeper.HQuorumPeerTest:66
          

          otherwise the test phase looks good, I will now 'assume all is well' with the test cycle under Maven and move on to packaging.
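
If excluding SoftValueSortedMapTest turns out to be the preferred option, a surefire exclusion along these lines should do it (sketch only):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <excludes>
      <!-- method-less, main-only harness; skip rather than delete for now -->
      <exclude>**/SoftValueSortedMapTest.java</exclude>
    </excludes>
  </configuration>
</plugin>
```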

          Paul Smith added a comment -

As it turned out, I had missed the need for 'fork' and the memory setting in the junit definition in build.xml; adding those same parameters to the Maven surefire plugin got me a lot closer.
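
For the record, the surefire equivalent of ant's fork/memory settings is roughly this; the forkMode and heap size below are placeholders, so use whatever build.xml actually specifies:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- mirror ant's fork="yes" / maxmemory settings on the junit task -->
    <forkMode>always</forkMode>
    <argLine>-Xmx512m</argLine>
  </configuration>
</plugin>
```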

          I've now only got 1 error, 2 failures, and 4 skipped tests (that's way less than before).

          thanks.

          Karthik K added a comment -

          but I can't see from the old ivy build system exactly what extra is needed for the 'hbase-core' tests

<!-- Test -->
<dependency org="org.apache.hadoop" name="hadoop-core-test"
    rev="${hadoop-core.version}" conf="test->default" transitive="false"/>
<dependency org="org.apache.hadoop" name="hadoop-hdfs-test"
    rev="${hadoop-hdfs.version}" conf="test->default" transitive="false"/>
<dependency org="org.apache.hadoop" name="hadoop-mapred-test"
    rev="${hadoop-mapred.version}" conf="test->default" transitive="false"/>
<dependency org="org.apache.commons" name="commons-math"
    rev="${commons-math.version}" conf="test->default"/>

          These modules are needed for test configuration, in addition to that of core.

The error seems to be jetty / jetty-util though.

          Do an

          $ ant compile

from the cmd line - and look at the list of jars at

${basedir}/build/lib/ivy/lib/common

and compare it with

$ mvn dependency:copy-dependencies .
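
That comparison could be scripted roughly like this (output paths are illustrative):

```shell
# Sketch: dump each build system's jar list and diff them.
ant compile
ls build/lib/ivy/lib/common | sort > /tmp/ivy-jars.txt
mvn dependency:copy-dependencies -DoutputDirectory=target/deps
ls target/deps | sort > /tmp/mvn-jars.txt
diff /tmp/ivy-jars.txt /tmp/mvn-jars.txt
```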

          Paul Smith added a comment -

Seems like an artifact of contrib/ec2 reusing build-contrib.xml. See HBASE-2126; looks like a build.xml quirk.

          Hrm, I don't think that's the problem with the Maven side as yet, I'm getting stuff like:

          ....
            <testcase time="1.403" classname="org.apache.hadoop.hbase.TestFullLogReconstruction" name="org.apache.hadoop.hbase.TestFullLogReconstruction">
              <error message="tried to access field org.slf4j.impl.StaticLoggerBinder.SINGLETON from class org.slf4j.LoggerFactory" type="java.lang.IllegalAccessError">java.lang.IllegalAccessError: tried to access field org.slf4j.impl.StaticLoggerBinder.SINGLETON from class org.slf4j.LoggerFactory
          	at org.slf4j.LoggerFactory.&lt;clinit&gt;(LoggerFactory.java:60)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          	at java.lang.reflect.Method.invoke(Method.java:597)
          	at org.mortbay.log.Slf4jLog.&lt;init&gt;(Slf4jLog.java:64)
          	at org.mortbay.log.Slf4jLog.&lt;init&gt;(Slf4jLog.java:37)
          ...
          

          and

          ...
            <testcase time="0.18" classname="org.apache.hadoop.hbase.TestZooKeeper" name="org.apache.hadoop.hbase.TestZooKeeper">
              <error message="Could not initialize class org.mortbay.log.Log" type="java.lang.NoClassDefFoundError">java.lang.NoClassDefFoundError: Could not initialize class org.mortbay.log.Log
          	at org.mortbay.component.Container.add(Container.java:200)
          	at org.mortbay.component.Container.update(Container.java:164)
          	at org.mortbay.component.Container.update(Container.java:106)
          	at org.mortbay.jetty.Server.setConnectors(Server.java:158)
          	at org.mortbay.jetty.Server.addConnector(Server.java:132)
          	at org.apache.hadoop.http.HttpServer.&lt;init&gt;(HttpServer.java:119)
          	at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:344
          ...
          

          I suspect there are more dependencies needed to be brought in for testing, but I can't see from the old ivy build system exactly what extra is needed for the 'hbase-core' tests (nothing to do with hbase-contrib as yet).

I haven't spent much time on that yet, but if anyone can spot the problem, feel free to let me know...
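
For what it's worth, the StaticLoggerBinder.SINGLETON IllegalAccessError is the classic symptom of mixed slf4j api/binding versions on the test classpath; `mvn dependency:tree` shows where each copy comes from, and an exclusion can pin it down. A sketch, where the coordinates below are purely illustrative rather than the confirmed culprit:

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>${hadoop.version}</version>
  <exclusions>
    <!-- drop the transitive slf4j binding so only one version is present -->
    <exclusion>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```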

          Karthik K added a comment -

          make the test cases work (I should work out why they also fail under ivy first I suspect)

Seems like an artifact of contrib/ec2 reusing build-contrib.xml. See HBASE-2126; looks like a build.xml quirk.

          Paul Smith added a comment -

          "Someone entered an issue for 3.3.0 re maven and source in the jar a while back, for 3.3.0 we have a patch to the build.xml that will address (4 jars now, original(current) + bin/src/doc jars for maven repo)"

Ok, good, so they have a plan for that. It's more an annoyance I guess; there's no functional issue I can think of here, since it's the same source as the binary, it's just being recompiled (well, we HOPE it's the same source, right?). The only trickery here is that if it ISN'T the same source, then because hbase is lexicographically earlier than zookeeper, in most JVMs I think it will find the zookeeper classes from the hbase jar, and use those, basically ignoring the binaries inside zookeeper.

          My next steps are in order:

          • use the assembly plugin to build the overall tar ball, mimic what ivy does
          • make the test cases work (I should work out why they also fail under ivy first I suspect)
          • flesh out the contrib area more (I'm currently only build stargate, and probably not completely, because I still haven't added the jruby dependency in just yet for runtime)

          I think if I do the first 2, then I'll post more details on the list.

          stack added a comment -

          On java files in zk jar (from our Patrick – I should have written the zk list instead of writing Patrick direct):

          "We initially copied hadoop, which does that as well. Some of the team members like it as it means you have the source right there if there's an issue (in particular useful if someone patched/compiled themselves).

          "Someone entered an issue for 3.3.0 re maven and source in the jar a while back, for 3.3.0 we have a patch to the build.xml that will address (4 jars now, original(current) + bin/src/doc jars for maven repo)"

          stack added a comment -

That's funny. Now I see why you wanted to wait before you wrote the list. As soon as you show them the above link with all the reports – I always liked xsource plugin... really useful – they are all going to vote +1 on maven.

          Paul Smith added a comment -

          I think this is where Maven starts to pay off, just by adding this snippet in the top-level pom:

            <distributionManagement>
              <repository>
                <id>Apache Public Releases</id>
                <url>scp://people.apache.org/home/psmith/public_html/hbase/repo/</url>
              </repository>
              <snapshotRepository>
                <id>Apache Public Snapshots</id>
                <name>Aconex Snapshots</name>
                <url>scp://people.apache.org/home/psmith/public_html/hbase/repo-snapshots/</url>
              </snapshotRepository>
              <site>
                <id>HBase Site</id>
                <url>scp://people.apache.org/home/psmith/public_html/hbase/sandbox/hbase/</url>
              </site>
            </distributionManagement>
          

and having corresponding username/password entries in my ~/.m2/settings.xml for each id, I can do this:

          # skip tests because they're currently failing in this interim
          
          mvn -Dmaven.test.skip.exec=true deploy site:deploy
          

Imagine that the Snapshots url is the real Apache Snapshots location; this is a fast way of providing working snapshot builds for upstream people to use without a full release:

          http://people.apache.org/~psmith/hbase/repo-snapshots/org/apache/hadoop/hbase/hbase-core/0.20.2-SNAPSHOT/
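
The matching ~/.m2/settings.xml entries look roughly like this; the server ids must match the ids in <distributionManagement> exactly, and the username/key paths below are placeholders:

```xml
<settings>
  <servers>
    <server>
      <id>Apache Public Snapshots</id>
      <username>psmith</username>
      <privateKey>${user.home}/.ssh/id_rsa</privateKey>
    </server>
    <server>
      <id>HBase Site</id>
      <username>psmith</username>
      <privateKey>${user.home}/.ssh/id_rsa</privateKey>
    </server>
  </servers>
</settings>
```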

Also the site generation (putting aside its ugliness) does produce useful info:

          http://people.apache.org/~psmith/hbase/sandbox/hbase/hbase-core/jdepend-report.html

          A full release can then use the maven-release-plugin, which will manage the SCM tagging, and pushing the final release candidates direct to their proper location on the Apache central release repo.
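
That flow is typically just (a sketch; flags and interactive prompts omitted):

```shell
mvn release:prepare   # bumps the pom versions and tags the SCM revision
mvn release:perform   # checks out the tag, builds, and deploys the artifacts
```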

          stack added a comment -

          Thanks for the above Paul.

          Paul Smith added a comment -

          Maybe its time to write the list then? The only downside I see is that if we go w/ ivy, we are in closer alignment with the hadoop parent and adjacent projects.

          I'd like to make a little more progress (certainly getting the test cases to work) before I make this more visible. As I said, it's not wasted effort if it is turned down. I'd say that providing more substantial progress may make going to Maven more appealing.

          Another minor downside is the hardwired "website" that maven generates. Its hard to pull around. Currently though, we're in alignment with parent project and we're using forrest which is a dog.

Maven site generation is not puuurrrty, but it is functional, and the integration of all the reports makes it very useful. Apart from the report generation side of things, the site configuration is not something I have very much experience with. Actually, how is the HBase site generated at the moment? I can't locate that in the build.xml/ivy setup. Is it external to hbase-trunk?

          Paul, what you think of the circa-2006 criticisms that maven is a time sink and that it'll make our build harder? (e.g. see commentary on spring moving to maven). Do you think newer maven better?

2006 is ancient. Really, I would have hated to use Maven back then. It has really come to be very stable and predictable since 2.2. 2.1 went a long way to convincing me it was time to convert our corporate build system to Maven, and it has worked well.

          When I read the ivy site on why ivy and how it compares to maven, I buy neither. The only arg. that makes sense for me is the one where ant remains the build driver if you go w/ ivy. Maven does a load of nice stuff for you.

Maven does do a lot of nice things. I think many people have had headaches converting their project to Maven, because the older setup of their build probably was 'non-standard' (that's not saying it was bad, just not conventional). When you drift too far from Maven's conventions, you are starting to go against the grain and it requires more energy to do so. As soon as I see something being 'hard' in Maven, I start to question whether trying to do it that way is a good idea; perhaps it's simpler to just adjust the project to suit Maven (the classic case being shuffling directories around to fit the common Maven layout, etc).

          Certainly starting off a new Maven project is a breeze, and adding new sub-modules becomes elegant, inheriting all the definitions from the parent, making refactoring/modularizing quite nice.

          I suspect migrating an existing ant build to ivy is way simpler than going to Maven. One has to see the longer term benefits of Maven before it becomes compelling.
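
A minimal sketch of the multi-module layout being described, where sub-modules inherit shared configuration from a parent POM. The module names, groupId, and version here are illustrative, not taken from any actual patch:

```xml
<!-- Hypothetical parent pom.xml: module names and versions are illustrative -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.apache.hadoop.hbase</groupId>
  <artifactId>hbase</artifactId>
  <version>0.21.0-SNAPSHOT</version>
  <packaging>pom</packaging>
  <!-- each module is a subdirectory with its own pom that names this one as parent -->
  <modules>
    <module>core</module>
    <module>contrib</module>
  </modules>
  <!-- dependency versions declared once here are inherited by every module -->
  <properties>
    <zookeeper.version>3.2.2</zookeeper.version>
  </properties>
</project>
```

Running `mvn install` from the parent builds the listed modules in dependency order, which is what makes the contrib-as-sibling-module arrangement workable.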

          stack added a comment -

          Maybe it's time to write the list then? The only downside I see is that if we go w/ Ivy, we are in closer alignment with the Hadoop parent and adjacent projects. Another minor downside is the hardwired "website" that Maven generates. It's hard to pull around. Currently though, we're in alignment with the parent project and we're using Forrest, which is a dog.

          Paul, what do you think of the circa-2006 criticisms that Maven is a time sink and that it'll make our build harder? (e.g. see the commentary on Spring moving to Maven). Do you think newer Maven is better?

          When I read the Ivy site on why Ivy and how it compares to Maven, I buy neither. The only argument that makes sense to me is the one where Ant remains the build driver if you go w/ Ivy. Maven does a load of nice stuff for you.

          Paul Smith added a comment -

          WIP so far

          • Webapps shuffled directories around, JSPC done via antrun plugin
          • package-info.java generation done via antrun plugin (slight tweak to the saveVersion.sh to provide the outputDirectory to use; this works around any assumption about a working directory and allows the outer Maven env to pass in the correct value). The build-helper plugin is then used to compile this generated source. There is a slight difference in output, because the ${project.version} is automatically provided to the script by Maven rather than user-entered (i.e. '0.21.0-dev' vs. '0.22-SNAPSHOT').
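
The antrun/build-helper arrangement described in that bullet could look roughly like the sketch below in the core POM. The script arguments and the generated-sources path are assumptions for illustration, not the actual patch contents:

```xml
<!-- Sketch: run saveVersion.sh in generate-sources, then register its output
     directory as an extra source root; paths and args are assumptions -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-antrun-plugin</artifactId>
  <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals><goal>run</goal></goals>
      <configuration>
        <tasks>
          <exec executable="sh">
            <arg value="${basedir}/src/saveVersion.sh"/>
            <arg value="${project.version}"/>
            <arg value="${project.build.directory}/generated-sources"/>
          </exec>
        </tasks>
      </configuration>
    </execution>
  </executions>
</plugin>
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>build-helper-maven-plugin</artifactId>
  <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals><goal>add-source</goal></goals>
      <configuration>
        <sources>
          <source>${project.build.directory}/generated-sources</source>
        </sources>
      </configuration>
    </execution>
  </executions>
</plugin>
```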

          Current differences between ivy- and maven-generated jars:

          • conf/hbase-default.xml appears in the jar (current assumption is this is unintended on the ivy part)
          • META-INF/MANIFEST.MF is different (some extra info added by Maven, missing the Main-Class reference, easily done in Maven, just a TODO)
          • org.apache.jute and zookeeper classes added; this is due to unintended source files being located in those artifacts' binaries. Still to chase up why Maven compiles these.
          • overview.html appears in the root of the Maven jar. Should that be located in webapps/ ?
          • ivy-generated jar contains 0 byte webapps/rest/META-INF/MANIFEST.MF (as well as one for the static tree). I'm presuming this is unintended by Ivy.
          • A small binary diff on org.apache.hadoop.hbase.HTableDescriptor.class. Not sure what that is, will need to check
          • Hbase.thrift is appearing in the Maven jar. I don't think this is needed in the jar? Probably simpler to just move this resource into a non-Maven-resource directory.
          Paul Smith added a comment -

          Thanks @stack. There is no question that if a project has a non-standard need, then using Maven is a bit of the classic square-peg-round-hole scenario.

          But it's surprising, when one looks at build needs, how often the commonality is there. Yes, with Maven you really do need to give up a little flexibility (one can make an existing structure work with Maven, but it's usually easier to just conform). The advantage of conformity is the principle of standardization; a common build system means that more people will understand how to use it ("Hey, I've used Maven projects before, so this one is easy to follow").

          I haven't spotted anything in the HBase environment as yet that is really 'non-standard'. You're building a series of JARs and packaging them up into a tarball; that's pretty vanilla flavoured as Maven builds go, and exactly what it was targeted for.

          Ivy has a nice sweet spot of being familiar to those with Ant experience, and it does foster some of the commonality Maven strives towards.

          At the end of the day, this must be driven from the hbase-dev community. I'm happy to provide an example working setup based on my experience for consideration, and if it is not needed, it's no biggie.

          stack added a comment -

          Here is some discussion from up in hadoop on maven vs ivy: http://www.mail-archive.com/core-dev@hadoop.apache.org/msg27653.html There may have been more but I got tired of searching.

          Paul Smith added a comment -

          I'll change the groupIds to match @stack's comment. I'll keep the artifactId for core as 'hbase-core', but I'll make sure the final jar name is just 'hbase-${project.version}.jar' to keep the consistency.

          Paul Smith added a comment -

          This sounds like a good idea. Working in a branch. github is good for this (It's hard giving a non-committer access to the apache svn repo).

          Although this will count for squat I'm sure, I am an Apache committer (psmith@apache.org, on the Logging Services projects). That makes it less administratively burdensome, but I can appreciate that a project team doesn't hand out privileges in Corn Flakes packets, so I'm perfectly happy to provide patches/instructions and help for now.

          I could probably get myself git-enabled (yet another excuse to do that, I really must), but it may just slow me down in the short term. I'll create a poor man's branch, and just check out a fresh copy of trunk now and then and reapply the changes for testing. Once I'm confident it's solid, I'll post the instructions here for someone with privs to try out, and they can commit for review by other hbase-devs.

          stack added a comment -

          .bq What I can probably do is do exactly what I did for our corporate mavenization. I took a branch off trunk, and built a sequence of steps to make the Maven switch, with a good bit of testing (lots of jar diffs).

          This sounds like a good idea. Working in a branch. github is good for this (It's hard giving a non-committer access to the apache svn repo).

          stack added a comment -

          I'm good w/ Kay Kay's suggestion of a two-phase commit.

          @Paul On svn moving, it'll leave notes in the patch as to what moved where... otherwise, you can do a rough description and I can move the stuff around.

          You know, you're getting pretty far along here, Paul. It might be time to solicit community opinion on the move to Maven. It's a big change; there should be some discussion up on hbase-dev. Maybe hack some more, and when you are confident it's going to work and you have a good idea of the shape it will take, let's work on posting a note.

          Good stuff.

          Paul Smith added a comment -

          I can't think of a seamless way of doing it that doesn't involve either keeping the Ivy configs in sync, or having a bit of flexibility (i.e. downtime) while the reshuffling is done.

          What I can probably do is do exactly what I did for our corporate mavenization. I took a branch off trunk, and built a sequence of steps to make the Maven switch, with a good bit of testing (lots of jar diffs).

          Once complete, we held off commits to trunk briefly while the migration script was applied (in my case I did it manually, step by step; it was only about a dozen or so commands), then committed.

          So perhaps we can do that, have a working test branch that someone can follow my steps locally for review and then once you guys are happy, you could run these steps yourself on trunk.

          I don't know how much work keeping Ivy in sync would be, risking some brittleness in trunk. How many devs build trunk? I'm not sure how big the hbase-dev team is; if it's small, that may be ok, but if it's big it could be a bit of an inconvenience.

          stack added a comment -

          Looking at the lightweight patch:

          + Our groupId is org.apache.hbase. I think that should be org.apache.hadoop.hbase. Will this groupId put us into the right relative location in the apache maven repo? Here's apache snapshots: https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/. We should be going into this directory I'd say.
          + The artifactId for core is hbase-core. Currently our hbase jar is named hbase-X.X.X.jar. If the artifactId is hbase-core, doesn't that mean the core jar will be named hbase-core-X.X.X.jar? Can we keep our old name? Maybe the parent artifactId could be hadoop-hbase and then this core module's could be plain hbase?
          + On these:

          +    <zookeeper.version>3.2.2</zookeeper.version>
          +    <thrift.version>r771587</thrift.version>
          

          Can't we just check these in? I remember that in maven for certain dependencies, you could override the pull from a remote repository and instead have it read from a local directory. Is that still so? (I might be remembering this wrong).

          + We can add in licensing and better project description stuff later, no worries.
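
On reading dependencies from a local directory: one way this can still be done is declaring a file-based repository in the POM, along these lines (the lib/repo path is hypothetical):

```xml
<!-- Sketch: resolve checked-in jars from a repository laid out inside the
     source tree; the directory name here is an assumption -->
<repositories>
  <repository>
    <id>in-tree</id>
    <url>file://${basedir}/lib/repo</url>
  </repository>
</repositories>
```

The checked-in jars would then need to be laid out Maven-style beneath that directory, e.g. org/apache/zookeeper/zookeeper/3.2.2/zookeeper-3.2.2.jar, so that normal dependency resolution finds them.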

          Karthik K added a comment -
          .bq is someone ok to handle any ivy changes as we make progress?

          You think we should have ivy and maven at same time? Once maven is working, we'd ditch ivy?

          Maybe the first cut of changes can be the directory restructure of the existing files, as fitting Maven.

          Change Ivy/build.xml to make sure the src tree changes go with them, so they continue to be useful for the existing process.

          Maintain pom.xml(s) as another patch here until it works completely, and commit it when working. Get away from Ivy then.

          stack added a comment -

          .bq yep, it appears to 'find' these as part of its java source location mechanism. It's odd that the zookeeper binary artifact has the .java files embedded in it (go check, it's there), but Maven shouldn't really be doing that if they're not within the defined compilation directories.

          Let me ask zk boys.

          .bq is someone ok to handle any ivy changes as we make progress?

          You think we should have ivy and maven at same time? Once maven is working, we'd ditch ivy?

          Paul Smith added a comment -

          Latest version of the 'patch'. I've included 2 forms, a full, and a 'lite' version that contains only the pom files (makes it easier to review).

          At this stage, the major surgery includes only:

          1. creation of a new core directory (plus pom, with external references)
          2. relocating src/contrib to be a top-level tree.

          I now move on to larger-scale svn mv hackery. I'm not 100% sure how this will come out in svn diff form; I may have to just produce a document with all the commands necessary to accomplish it. (I found in a previous large-scale Maven migration that doing an svn mv while people check new and modified files into the old directory BEFORE the mv is committed ensures major headaches under svn.)

          I hadn't thought of the JDK vs JRE point of view before. There's a maven-jspc-plugin, so we should be able to accomplish the same thing here no problems.

          Paul Smith added a comment -

          That the generated jar, hbase-0.21.0-dev.jar, has a hbase-default.xml in top-level and in a conf subdir looks wrong. I think the top-level one is the one we want. Whatever, its not your prob. Just make an issue on it and one of us will figure it.

          ok, that's fine. I figured it was the top-level one; I'll just make it only do that one.

          Its finding java files in the zk jar and compiling them?

          yep, it appears to 'find' these as part of its java source location mechanism. It's odd that the zookeeper binary artifact has the .java files embedded in it (go check, it's there), but Maven shouldn't really be doing that if they're not within the defined compilation directories.

          On moving dirs, thats no problem (Can't make an omelette...)

          Mine might be a spicy omelette, is someone ok to handle any ivy changes as we make progress?

          stack added a comment -

          Here's a few comments Paul (Thanks for taking this on).

          That the generated jar, hbase-0.21.0-dev.jar, has a hbase-default.xml in top-level and in a conf subdir looks wrong. I think the top-level one is the one we want. Whatever, its not your prob. Just make an issue on it and one of us will figure it.

          .bq Maven appears to locate some .java files from within the zookeeper dependency...

          Its finding java files in the zk jar and compiling them?

          .bq The ivy-generated jar contains pre-compiled JSP file...

          Yeah. HBase and Hadoop do this. I think it's so hadoop/hbase can run on a JRE, so that they don't require a JDK.

          .bq incidentally the saveVersion.sh could probably be replaced with a text replace during a Maven build cycle...

          Sure.

          On moving dirs, thats no problem (Can't make an omelette...)

          Paul Smith added a comment -

          ok, I'm making more progress. I'm getting test failures, but then I think the same tests fail when running through Ivy, so perhaps it's a local environment thing. For now I'm focusing on making the jars generated by Ivy and by Maven identical. I do this by unpacking the Ivy- and Maven-generated jars into two different directories and then running the Mac's opendiff tool to look for differences.

          Here are some bits I've discovered so far.

          • the ivy-generated targets appear to 'fatten up' the jar with duplicate files in different directories. For example, hbase-default.xml:
          bash-3.2$ unzip -l ../build/hbase-0.21.0-dev.jar | fgrep hbase-default.xml
              21472  01-12-10 11:04   hbase-default.xml
              21472  01-12-10 11:04   conf/hbase-default.xml
          

          I can probably replicate that behaviour but it will make the Maven pom more messy.

          • Maven appears to locate some .java files from within the zookeeper dependency and decides to compile them into the hbase class output. That's annoying; I haven't spotted this one before. I'll raise it with the Maven crew.
          • The ivy-generated jar contains pre-compiled JSP files:
          bash-3.2$ unzip -l ../build/hbase-0.21.0-dev.jar | fgrep generated | fgrep -v thrift
                  0  01-12-10 13:17   org/apache/hadoop/hbase/generated/
                  0  01-12-10 13:17   org/apache/hadoop/hbase/generated/master/
                  0  01-12-10 13:17   org/apache/hadoop/hbase/generated/regionserver/
              12646  01-12-10 13:17   org/apache/hadoop/hbase/generated/master/master_jsp.class
              13018  01-12-10 13:17   org/apache/hadoop/hbase/generated/master/table_jsp.class
               4857  01-12-10 13:17   org/apache/hadoop/hbase/generated/master/zk_jsp.class
               8833  01-12-10 13:17   org/apache/hadoop/hbase/generated/regionserver/regionserver_jsp.class
          

          I know HBase uses Jetty for its runtime web apps, and I can see the JSP compilation in build.xml, but whether the raw JSPs should be compiled and bundled isn't clear (for the small number of them it doesn't really seem high value; simpler for Jetty to compile them dynamically at boot time?). It seems to overcomplicate the build process.

          I suspect this is just going to be much cleaner with some directory reshuffling.

          I will keep pottering.
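
The jar comparison described above can also be scripted with plain diff in place of opendiff. A rough sketch, using throwaway directories to stand in for the real unpacked jars:

```shell
# Stand-ins for the unpacked ivy- and maven-built jars; with the real jars
# these directories would be populated via `unzip hbase-*.jar -d <dir>` first.
work=$(mktemp -d)
mkdir -p "$work/ivy/conf" "$work/maven"
echo '<configuration/>' > "$work/ivy/hbase-default.xml"
echo '<configuration/>' > "$work/ivy/conf/hbase-default.xml"   # the duplicate copy
echo '<configuration/>' > "$work/maven/hbase-default.xml"
# -r recurses into both trees, -q reports only which files differ or are missing
diff -rq "$work/ivy" "$work/maven" || true
```

Here diff flags the conf/ subdirectory as present only in the ivy tree, which is exactly the kind of packaging drift being chased.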

          Paul Smith added a comment -

          incidentally the saveVersion.sh could probably be replaced with a text replace during a Maven build cycle (during the generate-sources phase), using the ${project.version} variable.
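          The text-replace idea above can be sketched with Maven resource filtering; the directory and template file here are hypothetical, not part of the current build:

```xml
<!-- Hypothetical sketch: filter a properties template at build time so Maven
     substitutes ${project.version}, replacing saveVersion.sh. A template such
     as src/main/resources/hbase-version.properties would contain:
       version=${project.version} -->
<build>
  <resources>
    <resource>
      <directory>src/main/resources</directory>
      <filtering>true</filtering>
    </resource>
  </resources>
</build>
```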

          Hide
          Paul Smith added a comment -

          the more I look into this the more I think I could hack up a Maven structure that 'works' following the same pattern I had done for core, which is to create a dummy directory containing the pom. But that sort of smells.

          What I'd prefer to be done would be this

          1. Move src/contrib -> contrib, have a top level contrib/pom.xml that makes this tree a multi-module project
          2. Move src/java -> core/src/java
          3. Move src/test -> core/src/test (probably extract out of this a core/src/resources too, but may be possible to configure the resource plugin to filter out .java and use the same directory, not ideal though)
          4. Move src/test/data -> core/test/resources
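          Under the layout proposed above, the top-level pom would declare the tree as a multi-module build. This is only a hedged sketch; the groupId and version shown are assumptions:

```xml
<!-- Sketch of a top-level pom.xml for the proposed layout; the module names
     follow the directory moves listed above. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase</artifactId>
  <version>0.21.0-SNAPSHOT</version>
  <packaging>pom</packaging>
  <modules>
    <module>core</module>
    <module>contrib</module>
  </modules>
</project>
```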

          The src/webapps directory is interesting, I'm not exactly sure where that is supposed to be bundled up. But I can definitely see how we can use the assembly plugin and descriptors to create the overarching tar ball for release purposes to mimic the current distribution.

          This sort of proposal is obviously far more invasive. If there was a wish to have the Ivy setup in place in the interim, that would have to be modified to keep track.

          What about I just go and fiddle with the current setup and just show it in a fully working setup and upload it here as an example.

          Hide
          Paul Smith added a comment -

          yep, I've spotted those will add them as I progressively drill deeper into the sub-modules, I'm sure the test side is going to need to bring in more at this point.

          Hide
          Karthik K added a comment -

          Great - Paul. See also HBASE-2114 / HBASE-2115 that added a couple of dependencies as well ( jruby-complete / log4j ) to 'core' .

          Hide
          Paul Smith added a comment -

          Here's my first cut at this; it currently compiles the core fine but doesn't cater for the contrib modules yet.

          As said previously, I've simply added a new 'core' directory to contain the pom to fake Maven into thinking it's a sub-module, with the core/pom.xml referencing the java src via '..'.
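          A minimal sketch of that core/pom.xml trick, assuming the pom sits one level below the existing source tree:

```xml
<!-- Sketch of the "fake sub-module" approach: core/pom.xml pointing back at
     the existing tree via relative paths, so no files need to move yet. -->
<build>
  <sourceDirectory>${basedir}/../src/java</sourceDirectory>
  <testSourceDirectory>${basedir}/../src/test</testSourceDirectory>
</build>
```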

          Hide
          Paul Smith added a comment -

          ahh... the old "why don't you look in the directory named 'lib' response" coming I'm sure...

          ahem

          Hide
          Paul Smith added a comment -

          I managed to make some progress here by being... crafty (OK, it's a hack, let's be honest).

          What I've started down the path of doing is simply adding a 'core' subdirectory under the main hbase trunk checkout. Inside that directory is just a pom. Using judicious references to "${basedir}/../src/java" etc I can fake out Maven into thinking that the core is a true sub-module.

          Still, my main headache is finding a binary for thrift; I can't spot that anywhere, and indeed the ivy.xml currently has a commented-out block for zookeeper and ivy (I've got a local zookeeper binary in our corporate Nexus maven repo, so I'm just leveraging off that one for now).

          I suspect we would also need to split out a resource directory; currently the following files under src/java wouldn't end up in the final jar without some more directory moving or trickery:

          bash-3.2$ find src/java/ -type f | fgrep -v '.java' | fgrep -v '.svn'
          src/java//org/apache/hadoop/hbase/io/hfile/package.html
          src/java//org/apache/hadoop/hbase/ipc/package.html
          src/java//org/apache/hadoop/hbase/mapreduce/RowCounter_Counters.properties
          src/java//org/apache/hadoop/hbase/thrift/Hbase.thrift
          src/java//org/apache/hadoop/hbase/thrift/package.html
          src/java//overview.html
          

          If we could move these to say, src/resources, then it's pretty simple, just tell Maven that's the resource directory and it gets added to the jar.
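          Assuming those files were moved to src/resources as suggested, the pom addition would be roughly this (a sketch under the same '..'-relative layout described above, not a final structure):

```xml
<!-- Sketch: declare src/resources as a resource directory so the non-.java
     files listed above are copied into the jar alongside the classes. -->
<build>
  <resources>
    <resource>
      <directory>${basedir}/../src/resources</directory>
    </resource>
  </resources>
</build>
```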

          I'll keep working on it, but I feel fairly close to the core part being done. After that, I'll make sure the contrib compiles and packages nicely, then I'll need to cycle back and make sure the test cycle is working ok.

          Hide
          Paul Smith added a comment -

          I'm going to have to refresh my memory of the directory layout of contrib; it may be that a smaller directory shuffle will make it more amenable, but it's definitely doable: at Aconex we have 5 sub-modules.

          I think if src/java was moved to 'core/src/java', then you'd have 'core' as a sub-module, and contribs as a set of smaller sub-modules.

          If I have time, what I could do is use 'script' to record a session of my hackery, doing some pretend 'svn mv' around etc, and document pom details that work, and then upload a zip'd example of the hbase-trunk check out.

          The fact hbase has a small list of dependencies for core is encouraging, and the fact there's only 3 contrib modules.

          Setting up all the Maven reports for CPD, PMD, etc. is really simple; it's just dropping in a bunch of plugin definitions, of which I have many examples lying around, so I think I have a good chance of success here.

          I'm intrigued now; let's see what a lunchtime session gets me.
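          Dropping in plugin definitions for reporting, as mentioned above, looks roughly like this (an illustrative fragment; the exact plugin set and versions would be tuned for the project):

```xml
<!-- Illustrative reporting section: both plugins generate HTML reports
     during the site lifecycle. -->
<reporting>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-pmd-plugin</artifactId>
    </plugin>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>findbugs-maven-plugin</artifactId>
    </plugin>
  </plugins>
</reporting>
```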

          Hide
          stack added a comment -

          @Paul Smith Can you at least comment on how we might do src/contrib sub builds using maven? My (old) understanding is that to do our current hierarchy – i.e. hbase.jar and then src/contrib jars, hbase-stargate.jar, etc. – is possible but messy in maven. I did it previously but it was an ugly hack, something I'd not like to repeat. Thanks (And thanks for the almost volunteering (smile)).

          Hide
          Paul Smith added a comment -

          I sort of itched and nearly put my hand up to have a crack at this because I'm a bit of a Maven fan and have spent the last 18 months supporting the migration of a large and complex project into a series of sub-modules under Maven, so that experience would be useful here.

          The question is of course time. The recent switch to ivy in some ways makes this simpler because the artifact/version ids of the dependencies are now specified in a fairly maven friendly way.

          Still, time is the problem here, I still haven't made further progress on the hbase-utils so perhaps I should get that to Google code before I try and start something else.

          I'll certainly be lurking and can help review proposals of Maven poms and structures. There should be a way to specify a Maven pom that keeps the existing directory structures to minimize pain (albeit with a more complex pom, but a later directory shuffle after can still happen if needed).


            People

            • Assignee:
              Unassigned
              Reporter:
              stack
            • Votes:
              1
              Watchers:
              3

              Dates

              • Created:
                Updated:
                Resolved:

                Development