Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: backlog
    • Fix Version/s: 0.8.0
    • Component/s: blueprints
    • Labels:
      None

      Description

      Lets port the BigPetStore build to gradle for all the obvious reasons. This port might cause some minor breakage of other parallel efforts that change the pom file.

      The gradle ported build should contain:

      • a "test" phase which runs the custom data set generator
      • a "integration test - pig" phase which runs the pig data cleaner + aggrergator.
      1. BIGTOP-1269.patch
        89 kB
        bhashit parikh
      2. BIGTOP-1269.patch
        82 kB
        bhashit parikh
      3. BIGTOP-1269.patch
        82 kB
        bhashit parikh

        Issue Links

          Activity

          Hide
          bhashit bhashit parikh added a comment -

          I have pretty much completed porting the pom.xml based build to a gradle based build system. On the way, I have made a few chanages (some gradelization, and some refactoring). Here's a list

          1. Added support for scala.
          2. Added support for scalatest library. Which allows us to write tests using a really cool syntax. We can test both scala and java code using it.
          3. Changed the java version to 1.7. I wanted to change to 1.8, but gradle throws tantrums when we combine the scala plugin with java 8. We'll need to wait for gradle 2 before upgrading to 1.8.
          4. The .classpath file no longer needs to be modified by hand.
          5. Integration tests for each profile are now in their own tasks. So, instead of saying something like
            gradle integrationTest -Pprofile=pig,
            we now just say
            gradle integrationTestPig
          6. I have added a -Xlint:all flag to the compiler. It warns when we stray from best practices. I refactored the integration-test classes to get rid of all the warnings. A few more classes still have those warnings. I'll take care of them afterwards.
          7. Refactored to change the way the test-classes are included or excluded for each profile.
          8. Modified the README.md to reflect the changes in the build commands.

          That's pretty much it. We are down from about 800 lines to like 160 lines or so. And, as with all software related things, there will always be a room for further improvement.

          Show
          bhashit bhashit parikh added a comment - I have pretty much completed porting the pom.xml based build to a gradle based build system. On the way, I have made a few chanages (some gradelization, and some refactoring). Here's a list Added support for scala . Added support for scalatest library. Which allows us to write tests using a really cool syntax. We can test both scala and java code using it. Changed the java version to 1.7. I wanted to change to 1.8, but gradle throws tantrums when we combine the scala plugin with java 8 . We'll need to wait for gradle 2 before upgrading to 1.8. The .classpath file no longer needs to be modified by hand. Integration tests for each profile are now in their own tasks. So, instead of saying something like gradle integrationTest -Pprofile=pig , we now just say gradle integrationTestPig I have added a -Xlint:all flag to the compiler. It warns when we stray from best practices. I refactored the integration-test classes to get rid of all the warnings. A few more classes still have those warnings. I'll take care of them afterwards. Refactored to change the way the test-classes are included or excluded for each profile. Modified the README.md to reflect the changes in the build commands. That's pretty much it. We are down from about 800 lines to like 160 lines or so. And, as with all software related things, there will always be a room for further improvement.
          Hide
          jayunit100 jay vyas added a comment -

          ahhhh yessss thanks bhashit and welcome to bigtop

          Show
          jayunit100 jay vyas added a comment - ahhhh yessss thanks bhashit and welcome to bigtop You can now make a patch using git, as shown here: https://cwiki.apache.org/confluence/display/BIGTOP/How+to+Contribute i will review it , test and it locally and make sure the jar builds correctly. From then on , we can write bigpetstore components in both scala AND java , and this will nicely pave the way for spark extensions to BPS.
          Hide
          bhashit bhashit parikh added a comment -

          Migrating from maven to gradle. Patch 1.

          Show
          bhashit bhashit parikh added a comment - Migrating from maven to gradle. Patch 1.
          Hide
          jayunit100 jay vyas added a comment - - edited

          Looks like some technical issues. but overall I think this is a pretty good patch.

          1) Trailing whitespace errors. You can easily have intellij fix this for you : See BIGTOP-1240 for details.

          2) It looks like we've lost the classpath isolation that the original maven profiles encapsulated. We will want to retain this so that all the individual integration tests can run under different classpaths. For example, if "X" relies on a new version of zookeeper and "Y" relies on a new version of zookeeper, we want X and Y to have independently scoped dependencies in their builds.

          3) The build fails when I ran it locally on my mac with gradle clean build

          Total time: 24.005 secs
          07:26:55 {master} $ gradle clean build
          :clean UP-TO-DATE
          :compileJava
          
          FAILURE: Build failed with an exception.
          
          * What went wrong:
          Could not resolve all dependencies for configuration ':compile'.
          > Could not download artifact 'org.apache.avro:avro-ipc:1.7.4:avro-ipc-tests.jar'
             > Artifact 'org.apache.avro:avro-ipc:1.7.4:avro-ipc-tests.jar' not found.
          
          * Try:
          Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.
          
          BUILD FAILED
          
          Total time: 7.945 secs
          

          All in all though this is a great first start.

          Show
          jayunit100 jay vyas added a comment - - edited Looks like some technical issues. but overall I think this is a pretty good patch. 1) Trailing whitespace errors. You can easily have intellij fix this for you : See BIGTOP-1240 for details. 2) It looks like we've lost the classpath isolation that the original maven profiles encapsulated. We will want to retain this so that all the individual integration tests can run under different classpaths. For example, if "X" relies on a new version of zookeeper and "Y" relies on a new version of zookeeper, we want X and Y to have independently scoped dependencies in their builds. 3) The build fails when I ran it locally on my mac with gradle clean build Total time: 24.005 secs 07:26:55 {master} $ gradle clean build :clean UP-TO-DATE :compileJava FAILURE: Build failed with an exception. * What went wrong: Could not resolve all dependencies for configuration ':compile'. > Could not download artifact 'org.apache.avro:avro-ipc:1.7.4:avro-ipc-tests.jar' > Artifact 'org.apache.avro:avro-ipc:1.7.4:avro-ipc-tests.jar' not found. * Try: Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. BUILD FAILED Total time: 7.945 secs All in all though this is a great first start.
          Hide
          bhashit bhashit parikh added a comment - - edited
          1. Working on it.
          2. I have added the capability for specifying the integration-test specific classpaths. I'll upload the new patch once I have dealt with the trailing-whitespace issue.
          3. This is a known issue with using mavenLocal() repo. See this issue. We need to use mavenLocal since we are building mahout-core ourselves with support for hadoop 2.x. There are two workarounds here:
            1. Delete the directory of org.apache.avro, or any other artifact that causes this problem, from the local maven repo. If we go with this, I documented this issue in the README today.
            2. We specify custom logic in our build.gradle to checkout the mahout from github, do a local build against hadoop 2.x and allow gradle to find it. A bash script or some code in groovy should be able to handle that.
          Show
          bhashit bhashit parikh added a comment - - edited Working on it. I have added the capability for specifying the integration-test specific classpaths. I'll upload the new patch once I have dealt with the trailing-whitespace issue. This is a known issue with using mavenLocal() repo. See this issue . We need to use mavenLocal since we are building mahout-core ourselves with support for hadoop 2.x . There are two workarounds here: Delete the directory of org.apache.avro , or any other artifact that causes this problem, from the local maven repo. If we go with this, I documented this issue in the README today. We specify custom logic in our build.gradle to checkout the mahout from github , do a local build against hadoop 2.x and allow gradle to find it. A bash script or some code in groovy should be able to handle that.
          Hide
          bhashit bhashit parikh added a comment - - edited

          New patch uploaded with:

          1. added support for configuring dependencies based on integration-test profile.
          2. trailing white-space removal
          3. convert spaces to tabs
          Show
          bhashit bhashit parikh added a comment - - edited New patch uploaded with: added support for configuring dependencies based on integration-test profile. trailing white-space removal convert spaces to tabs
          Hide
          jayunit100 jay vyas added a comment - - edited

          OKay looking better.

          How about this idea to get us past the mahout hurdle?

          Can we make it so that the default build is a little more explicit in terms of what source it includes, and specifically, have it not include hive and not include mahout packages as source for now ?

          Heres why:
          We can deal with mahout later in BIGTOP-1272 (by making it an optional extension to the build), and hive in BIGTOP-1270 (thinking to remove hive entirely) ?

          So, the idea is : Lets focus on making sure its (1) easy to add new modules (optionally) to bigpetstore and (2) have the main build focus simply on the generation and pig phases, for now. Then we will incrementally add back new ecosystem components in later JIRAs.

          After we do this, i will test again and then think this patch is done.

          Show
          jayunit100 jay vyas added a comment - - edited OKay looking better. How about this idea to get us past the mahout hurdle? Can we make it so that the default build is a little more explicit in terms of what source it includes, and specifically, have it not include hive and not include mahout packages as source for now ? Heres why: We can deal with mahout later in BIGTOP-1272 (by making it an optional extension to the build), and hive in BIGTOP-1270 (thinking to remove hive entirely) ? So, the idea is : Lets focus on making sure its (1) easy to add new modules (optionally) to bigpetstore and (2) have the main build focus simply on the generation and pig phases, for now. Then we will incrementally add back new ecosystem components in later JIRAs. After we do this, i will test again and then think this patch is done.
          Hide
          bhashit bhashit parikh added a comment - - edited

          I see what you are saying. However, we cannot have optional dependencies. To have real optional dependencies the way you are suggesting, we would need to mess with the source-sets to exclude/include certain packages/classes based on parameters that are specified/omitted when running gradle commands. We can do that, but if we go down that path, things will get too messy.

          I am not talking about the optional dependencies of maven. Let me explain. We can make sure that each kind of build (e.g. for pig profile) is built with a specific version of dependencies. However, if we have any source-code that depends on some external library (e.g. mahout), we can't leave out that dependency entirely. The source won't compile unless the some version of that dependency is at least specified. So, if we keep any source-code that depends on mahout, we'll need a mahout dependency. The same goes for hive. However, to get around this, and achieve what you are saying, we can just specify the publicly available versions of mahout-core as a dependency. This will allow any existing sources/tests to compile. These won't work when running against hadoop 2.x, but since we don't want to run them right now, we can just disable the integration-tests for mahout and hive for now.

          Although, based on what I understand, I think the real solution is to have each module as a subproject or something. bigpetstore being the parent project with all the common behavior and the a sub-project each for pig, hive and others. Each sub-project gets included and excluded based on the parameters that we provide, or through the settings.gradle file. This way, we can have a clean separation of configurations. I am not sure whether this is what you want. If we want to do this, I'll need to discuss a few more things with you and think it through.

          Does this make sense?

          Show
          bhashit bhashit parikh added a comment - - edited I see what you are saying. However, we cannot have optional dependencies. To have real optional dependencies the way you are suggesting, we would need to mess with the source-sets to exclude/include certain packages/classes based on parameters that are specified/omitted when running gradle commands. We can do that, but if we go down that path, things will get too messy. I am not talking about the optional dependencies of maven . Let me explain. We can make sure that each kind of build (e.g. for pig profile) is built with a specific version of dependencies. However, if we have any source-code that depends on some external library (e.g. mahout ), we can't leave out that dependency entirely. The source won't compile unless the some version of that dependency is at least specified. So, if we keep any source-code that depends on mahout , we'll need a mahout dependency. The same goes for hive . However, to get around this, and achieve what you are saying, we can just specify the publicly available versions of mahout-core as a dependency. This will allow any existing sources/tests to compile. These won't work when running against hadoop 2.x , but since we don't want to run them right now, we can just disable the integration-tests for mahout and hive for now. Although, based on what I understand, I think the real solution is to have each module as a subproject or something. bigpetstore being the parent project with all the common behavior and the a sub-project each for pig , hive and others. Each sub-project gets included and excluded based on the parameters that we provide, or through the settings.gradle file. This way, we can have a clean separation of configurations. I am not sure whether this is what you want. If we want to do this, I'll need to discuss a few more things with you and think it through. Does this make sense?
          Hide
          jayunit100 jay vyas added a comment -

          Okay, I see how sourceSets could get ugly.

          As the mahout and hive code is really not useable and somewhat trivial as of yet, how about we just remove both?

          Then we can add them back in in our subsequent jiras as atomic patches in the two jiras I mentioned.?

          Do you feel comfortable removing the hive and mahout stuff entirely for a final cleanup patch?

          Show
          jayunit100 jay vyas added a comment - Okay, I see how sourceSets could get ugly. As the mahout and hive code is really not useable and somewhat trivial as of yet, how about we just remove both? Then we can add them back in in our subsequent jiras as atomic patches in the two jiras I mentioned.? Do you feel comfortable removing the hive and mahout stuff entirely for a final cleanup patch?
          Hide
          bhashit bhashit parikh added a comment -

          Yeah, I can do that. WIll upload the patch soon.

          Show
          bhashit bhashit parikh added a comment - Yeah, I can do that. WIll upload the patch soon.
          Hide
          bhashit bhashit parikh added a comment -

          Uploaded patch with some cleanup. Removed hive and mahout code and dependencies. Updated README to reflect that.

          Show
          bhashit bhashit parikh added a comment - Uploaded patch with some cleanup. Removed hive and mahout code and dependencies. Updated README to reflect that.
          Hide
          jayunit100 jay vyas added a comment -

          Cool!

          I'll review and test this shortly and let you know if it works.

          Show
          jayunit100 jay vyas added a comment - Cool! I'll review and test this shortly and let you know if it works.
          Hide
          jayunit100 jay vyas added a comment -

          Alas, adding scala hooks now makes it fail in my new machine - i think the error is on my end with my scala set up (ScalaDoc is missing?) looking......

          Show
          jayunit100 jay vyas added a comment - Alas, adding scala hooks now makes it fail in my new machine - i think the error is on my end with my scala set up (ScalaDoc is missing?) looking......
          Hide
          jayunit100 jay vyas added a comment -

          +1 will commit shortly unless anyone objects to this? Here are my notes on the review.

          First, ,i had to download and Export JAVA_HOME=path_to_java7_home fixed the issue. I guess there is an issue w/ gradle and scala on java 8 currently.

          Anyways - building the jar worked.. I tested 3 bits of core functionality.

          1) gradle tasks

          This printed out all the tasks cleanly. Would be cool to customize it in the future to print out the task incantations in your README updates.

          2) gradle build integrationTest

          This also worked : but integrationTest SKIPPED because there was no profile. I suggest in the future we put an error message here rather than skipping all tests. After all , there is no point in running an integrationTest without a ITProfile parameter.

          3) gradle clean integrationTest -P ITProfile=pig

          Ran the pig flow. I saw the data in my local bps_integration after and it looks good.

          So, I'll commit this shortly !

          So we've preserved functionality and dropped 100s of lines of code. Awesome !

          • BigPetStore now builds with gradle
          • BigPetStore now supports scala

          Thanks bhashit ! looking forward to more contributions to come . Will commit shortly while i port apache credentials to my new machine.

          Show
          jayunit100 jay vyas added a comment - +1 will commit shortly unless anyone objects to this? Here are my notes on the review. First, ,i had to download and Export JAVA_HOME=path_to_java7_home fixed the issue. I guess there is an issue w/ gradle and scala on java 8 currently. Anyways - building the jar worked.. I tested 3 bits of core functionality. 1) gradle tasks This printed out all the tasks cleanly. Would be cool to customize it in the future to print out the task incantations in your README updates. 2) gradle build integrationTest This also worked : but integrationTest SKIPPED because there was no profile. I suggest in the future we put an error message here rather than skipping all tests. After all , there is no point in running an integrationTest without a ITProfile parameter. 3) gradle clean integrationTest -P ITProfile=pig Ran the pig flow. I saw the data in my local bps_integration after and it looks good. So, I'll commit this shortly ! So we've preserved functionality and dropped 100s of lines of code. Awesome ! BigPetStore now builds with gradle BigPetStore now supports scala Thanks bhashit ! looking forward to more contributions to come . Will commit shortly while i port apache credentials to my new machine.
          Hide
          jayunit100 jay vyas added a comment -

          and commited ..........

          Show
          jayunit100 jay vyas added a comment - and commited ..........
          Hide
          bhashit bhashit parikh added a comment - - edited

          Awesome.

          I'll add the java 7 required part, as well as any other thing that I can think of, to the README later.

          Show
          bhashit bhashit parikh added a comment - - edited Awesome. I'll add the java 7 required part, as well as any other thing that I can think of, to the README later.
          Hide
          cos Konstantin Boudnik added a comment -

          Guys, please do not add 30 odd lines of the comment to the commits

          Show
          cos Konstantin Boudnik added a comment - Guys, please do not add 30 odd lines of the comment to the commits
          Hide
          cos Konstantin Boudnik added a comment - - edited

          Also, a commit comment should contain the number and summary of the relevant JIRA ticket. E.g. in this case it would say something like
          BIGTOP-1269. BigPetStore: Create build w/ gradle

          Show
          cos Konstantin Boudnik added a comment - - edited Also, a commit comment should contain the number and summary of the relevant JIRA ticket. E.g. in this case it would say something like BIGTOP-1269 . BigPetStore: Create build w/ gradle
          Hide
          cos Konstantin Boudnik added a comment -

          Another one - the implemented gradle build seems to be disconnected from the top level gradle. Is it intentional?

          Show
          cos Konstantin Boudnik added a comment - Another one - the implemented gradle build seems to be disconnected from the top level gradle. Is it intentional?
          Hide
          cos Konstantin Boudnik added a comment -

          Another one - the implemented gradle build seems to be disconnected from the top level gradle. Is it intentional?

          Last but not least - reviewers, please make sure to leave a formal +1 in your comment.

          Show
          cos Konstantin Boudnik added a comment - Another one - the implemented gradle build seems to be disconnected from the top level gradle. Is it intentional? Last but not least - reviewers, please make sure to leave a formal +1 in your comment.
          Hide
          jayunit100 jay vyas added a comment -

          Oops on the commit lines .
          Regarding the top level gradle build: this patch inherits meta info from top level Pom.

          *My thoughts are that the the top level pom is still the main overall project defining file:* is that going to change? When are we going to phase that Pom out completely and replace it with gradle?

          And also ... Just Let me know If I should to do anything retroactively to repair the commit log. I'll get to it.

          Show
          jayunit100 jay vyas added a comment - Oops on the commit lines . Regarding the top level gradle build: this patch inherits meta info from top level Pom. *My thoughts are that the the top level pom is still the main overall project defining file:* is that going to change? When are we going to phase that Pom out completely and replace it with gradle? And also ... Just Let me know If I should to do anything retroactively to repair the commit log. I'll get to it.
          Hide
          jayunit100 jay vyas added a comment -

          FYI I've updated https://cwiki.apache.org/confluence/display/BIGTOP/How+to+Contribute with a "Reviewing patches" section. bhashit parikh you can follow those guidelines if your going to put in another bigtop patch. thanks !

          Show
          jayunit100 jay vyas added a comment - FYI I've updated https://cwiki.apache.org/confluence/display/BIGTOP/How+to+Contribute with a "Reviewing patches" section. bhashit parikh you can follow those guidelines if your going to put in another bigtop patch. thanks !

            People

            • Assignee:
              Unassigned
              Reporter:
              jayunit100 jay vyas
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development