Avro
  1. Avro
  2. AVRO-716

New Java build: integrate with parent build and remove cruft

    Details

    • Type: Task Task
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.0
    • Component/s: build, java
    • Labels:
      None

      Description

      We have legacy ant and ivy items in lang/java that need to be removed or reduced.

      'ant clean' 'and compile' and 'ant test' can simply forward to maven, we might still want to have ant drive interop tests, and it is useful to have ant around as a tool.

      Additionally, buildbot currently fails and build.sh from the top level does not properly build Java.

      1. AVRO-716.v1.patch
        28 kB
        Scott Carey
      2. AVRO-716.v2.patch
        110 kB
        Scott Carey
      3. AVRO-716.v3.patch
        30 kB
        Scott Carey
      4. mod_build.v4.sh
        0.3 kB
        Scott Carey
      5. AVRO-716.v4.patch
        32 kB
        Scott Carey

        Activity

        Hide
        Doug Cutting added a comment -

        We also need to get "rat" to run on the content of the source tarball. This is currently done by Java's build.xml. We could leave that there.

        As you indicate above, we might also keep driving the interop tests with Ant. The complication I see is building an appropriate classpath that includes the dependencies. Any ideas of how to do that? Should we have Ivy read the pom files and download the jars? Is there a way to get Maven to put all the dependencies in a directory?

        Show
        Doug Cutting added a comment - We also need to get "rat" to run on the content of the source tarball. This is currently done by Java's build.xml. We could leave that there. As you indicate above, we might also keep driving the interop tests with Ant. The complication I see is building an appropriate classpath that includes the dependencies. Any ideas of how to do that? Should we have Ivy read the pom files and download the jars? Is there a way to get Maven to put all the dependencies in a directory?
        Hide
        Ken Krugler added a comment -

        If you're trying to use Ant, but specify dependencies via a pom file, then yes it's possible - that's what we do in the Bixo project. You need the maven-ant-tasks jar (we use version 2.0.10), which could be auto-downloaded if you wanted to have no lib directory in the project. Then there's a somewhat funky target that needs to be added to the build.xml, after that it's straightforward. See the mvn-init target in https://github.com/bixo/bixo/blob/master/build.xml for an example.

        Show
        Ken Krugler added a comment - If you're trying to use Ant, but specify dependencies via a pom file, then yes it's possible - that's what we do in the Bixo project. You need the maven-ant-tasks jar (we use version 2.0.10), which could be auto-downloaded if you wanted to have no lib directory in the project. Then there's a somewhat funky target that needs to be added to the build.xml, after that it's straightforward. See the mvn-init target in https://github.com/bixo/bixo/blob/master/build.xml for an example.
        Hide
        Scott Carey added a comment -

        We also need to get "rat" to run on the content of the source tarball. This is currently done by Java's build.xml. We could leave that there.

        That is probably the easiest thing to do, unless maven's RAT plugin works with the exclusion list easily. I got it working quickly on the Java tree, but it complains about a few things that need to be excluded. Does the current java ant task run RAT for all languages? If so, it should remain an ant task since Maven won't run RAT outside of its directory tree.

        As you indicate above, we might also keep driving the interop tests with Ant. The complication I see is building an appropriate classpath that includes the dependencies. Any ideas of how to do that? Should we have Ivy read the pom files and download the jars? Is there a way to get Maven to put all the dependencies in a directory?

        We could, but it might be easier to use the Maven facilities for this. It is easy to have it copy all the dependent jars in one place. Or, we can use the 'failsafe' plugin to run the interop tests as integration tests. Or, configure a profile with them as a separate set of Unit tests with surefire and activate that profile from the command line after the build "mvn test -P interop". I am planning on working on this ticket early next week, along with the other main tasks to complete the maven migration.

        Show
        Scott Carey added a comment - We also need to get "rat" to run on the content of the source tarball. This is currently done by Java's build.xml. We could leave that there. That is probably the easiest thing to do, unless maven's RAT plugin works with the exclusion list easily. I got it working quickly on the Java tree, but it complains about a few things that need to be excluded. Does the current java ant task run RAT for all languages? If so, it should remain an ant task since Maven won't run RAT outside of its directory tree. As you indicate above, we might also keep driving the interop tests with Ant. The complication I see is building an appropriate classpath that includes the dependencies. Any ideas of how to do that? Should we have Ivy read the pom files and download the jars? Is there a way to get Maven to put all the dependencies in a directory? We could, but it might be easier to use the Maven facilities for this. It is easy to have it copy all the dependent jars in one place. Or, we can use the 'failsafe' plugin to run the interop tests as integration tests. Or, configure a profile with them as a separate set of Unit tests with surefire and activate that profile from the command line after the build "mvn test -P interop". I am planning on working on this ticket early next week, along with the other main tasks to complete the maven migration.
        Hide
        Doug Cutting added a comment -

        > Does the current java ant task run RAT for all languages?

        Yes. It is run on the full source tree by the top-level 'dist' target before building the source tarball. It only lives in Java's build.xml for historic reasons, and because it shared some logic (ivy stuff mostly) with Java-specific targets. If it's the only target remaining then it should probably move to the top-level.

        > I am planning on working on this ticket early next week [ ... ]

        Great! Thanks!

        Show
        Doug Cutting added a comment - > Does the current java ant task run RAT for all languages? Yes. It is run on the full source tree by the top-level 'dist' target before building the source tarball. It only lives in Java's build.xml for historic reasons, and because it shared some logic (ivy stuff mostly) with Java-specific targets. If it's the only target remaining then it should probably move to the top-level. > I am planning on working on this ticket early next week [ ... ] Great! Thanks!
        Hide
        Scott Carey added a comment -

        Additionally we need to build -javadoc and -source jars. That should be easy.

        Show
        Scott Carey added a comment - Additionally we need to build -javadoc and -source jars. That should be easy.
        Hide
        Scott Carey added a comment -

        The Interop tests use RandomData in order to generate binary data for testing.

        This is used in a few places and we already have two copies of the class. I think it makes sense to move this from the test tree to o.a.a.util and have a public API for it. We can then make a Tool and use avro-tools.jar to create the interop data. Generating random data with the command line tool has many other uses too.

        Does that sound reasonable? If we do this, I think it makes sense to put the Tool portion into a different ticket since it will need its own section in CHANGES.txt.

        Show
        Scott Carey added a comment - The Interop tests use RandomData in order to generate binary data for testing. This is used in a few places and we already have two copies of the class. I think it makes sense to move this from the test tree to o.a.a.util and have a public API for it. We can then make a Tool and use avro-tools.jar to create the interop data. Generating random data with the command line tool has many other uses too. Does that sound reasonable? If we do this, I think it makes sense to put the Tool portion into a different ticket since it will need its own section in CHANGES.txt.
        Hide
        Doug Cutting added a comment -

        I'm fine adding RandomData as a utility class, especially if a we base a command-line tool on it. That said, is there no way with Maven to share test utilities between modules without adding them to the primary public artifacts?

        Show
        Doug Cutting added a comment - I'm fine adding RandomData as a utility class, especially if a we base a command-line tool on it. That said, is there no way with Maven to share test utilities between modules without adding them to the primary public artifacts?
        Hide
        Scott Carey added a comment -

        There is a way to share test utilities. We just create a -tests artifact that downstream artifacts can declare in test scope.

        Example: http://www.waltercedric.com/java-j2ee-mainmenu-53/361-maven-build-system/1349-maven-reusing-test-classes-across-multi-modules-projects.html

        Making this a tool however makes it easier to do the interop tests by simply using the tool in avro-tools.jar to generate the data. I suppose calling a main method on RandomData from avro-1.5.0-tests.jar would be easy as well.

        Show
        Scott Carey added a comment - There is a way to share test utilities. We just create a -tests artifact that downstream artifacts can declare in test scope. Example: http://www.waltercedric.com/java-j2ee-mainmenu-53/361-maven-build-system/1349-maven-reusing-test-classes-across-multi-modules-projects.html Making this a tool however makes it easier to do the interop tests by simply using the tool in avro-tools.jar to generate the data. I suppose calling a main method on RandomData from avro-1.5.0-tests.jar would be easy as well.
        Hide
        Scott Carey added a comment -

        this patch addressed remaining build integration:

        • In lang/java/ it changes the pom.xml files to have a 'release' profile for building javadoc and source jars. To build these, add the command line argument '-P release' to a maven command that completes at least the 'package' build phase.
        • In lang/java/ it adds an 'interop-data-test' profile that changes the behavior or 'mvn test' to execute only the interop data tests.
        • Moves the interop test out from an inner class to its own class.
        • Creates a jar artifact from the 'core' project tests for use by downstream projects to avoid duplicating RandomData and AvroTestUtil.
        • Adds a couple missing apache license headers.
        • Enables RAT on the top level. This is done by using maven to call the RAT ant task. An associated pom.xml is now at the root of the project. This was easier than using ant + ivy to fetch dependencies; the pom.xml is small. The maven-rat-plugin did not seem to like to use the exclude file and had some configuration bugs.
        • share/rat-excludes.txt needed some minor changes. I think we might be able to configure RAT to accept the many MIT licenses that cause it to fail rather than excluding them later, but I did not try and figure that out.
        • Modified build.sh and share/test/interop/bin/test_rpc_interop.sh for the above features.

        Notes:

        1. TestFileSpanStorage was failing for me about 2/3 of the time, I modified one of its sleep calls to be 200ms instead of 100ms so that it passes consistently. This test needs an overhaul later – it should not sleep at all but instead wait for signals. Sleeps in unit tests usually indicate that the API of the things being tested are not complete and need methods to await asynchronous completion of tasks.
        2. build.sh uses the bash subscript style extensively. This breaks the '-e' fail-on-error policy with bash 3.x For example:
          set -e; echo start; (exit 1); echo never;

          prints both 'start' and 'never' for me. On bash 4.x it would not print 'never'. Essentially, the error result of the subscript does not trigger the -e handling for all users. We might have to change it to avoid subshells or manually check $? after each one. I did not address that here, but several commands I modified are not in a subscript.

        Show
        Scott Carey added a comment - this patch addressed remaining build integration: In lang/java/ it changes the pom.xml files to have a 'release' profile for building javadoc and source jars. To build these, add the command line argument '-P release' to a maven command that completes at least the 'package' build phase. In lang/java/ it adds an 'interop-data-test' profile that changes the behavior or 'mvn test' to execute only the interop data tests. Moves the interop test out from an inner class to its own class. Creates a jar artifact from the 'core' project tests for use by downstream projects to avoid duplicating RandomData and AvroTestUtil. Adds a couple missing apache license headers. Enables RAT on the top level. This is done by using maven to call the RAT ant task. An associated pom.xml is now at the root of the project. This was easier than using ant + ivy to fetch dependencies; the pom.xml is small. The maven-rat-plugin did not seem to like to use the exclude file and had some configuration bugs. share/rat-excludes.txt needed some minor changes. I think we might be able to configure RAT to accept the many MIT licenses that cause it to fail rather than excluding them later, but I did not try and figure that out. Modified build.sh and share/test/interop/bin/test_rpc_interop.sh for the above features. Notes: TestFileSpanStorage was failing for me about 2/3 of the time, I modified one of its sleep calls to be 200ms instead of 100ms so that it passes consistently. This test needs an overhaul later – it should not sleep at all but instead wait for signals. Sleeps in unit tests usually indicate that the API of the things being tested are not complete and need methods to await asynchronous completion of tasks. build.sh uses the bash subscript style extensively. This breaks the '-e' fail-on-error policy with bash 3.x For example: set -e; echo start; (exit 1); echo never; prints both 'start' and 'never' for me. On bash 4.x it would not print 'never'. Essentially, the error result of the subscript does not trigger the -e handling for all users. We might have to change it to avoid subshells or manually check $? after each one. I did not address that here, but several commands I modified are not in a subscript.
        Hide
        Doug Cutting added a comment -

        Looks great, Scott!

        A few things I noticed:

        • perhaps we should create a lang/java/build.sh that contains the logic for each of the top-level targets, so that the top-level just has something simple like '(cd lang/java; ./build.sh dist)'?
        • i just realized that all of the pom.xml files have the version (1.5.0-SNAPSHOT) duplicated. prior to the switch to maven, the Avro version was only in a single place, share/VERSION.txt at the top-level. perhaps lang/java/build.sh could invoke 'mvn -Davro.version=`cat ../../share/VERSION.txt`', and the pom.xml files could be changed to refer to $avro.version? that would mean that running 'mvn' from the command line might not work though. or maybe we could use the maven properties plugin to have all of the poms read from a common properties file that's generated from share/VERSION.txt...
        Show
        Doug Cutting added a comment - Looks great, Scott! A few things I noticed: perhaps we should create a lang/java/build.sh that contains the logic for each of the top-level targets, so that the top-level just has something simple like '(cd lang/java; ./build.sh dist)'? i just realized that all of the pom.xml files have the version (1.5.0-SNAPSHOT) duplicated. prior to the switch to maven, the Avro version was only in a single place, share/VERSION.txt at the top-level. perhaps lang/java/build.sh could invoke 'mvn -Davro.version=`cat ../../share/VERSION.txt`', and the pom.xml files could be changed to refer to $avro.version? that would mean that running 'mvn' from the command line might not work though. or maybe we could use the maven properties plugin to have all of the poms read from a common properties file that's generated from share/VERSION.txt...
        Hide
        Scott Carey added a comment -

        perhaps we should create a lang/java/build.sh

        Good idea. I might be able to avoid the subshell too (and thus, let '-e' work).

        on POM versions:

        The maven-release-plugin can be used to change the version in the poms during release. I believe it would be something like

        mvn -B release:update-versions -DautoVersionSubmodules=true -DdevelopmentVersion=`cat ../../share/VERSION.txt`
        

        To change the versions to 1.5.0. Or, without the -B command it will prompt for the version interactively.

        I'll look into what the maven-properties-plugin can do. That might be simpler – most of the maven tools assume that that the whole build is within maven. And we're never realistically going to have the same build system across 5+ languages.

        Show
        Scott Carey added a comment - perhaps we should create a lang/java/build.sh Good idea. I might be able to avoid the subshell too (and thus, let '-e' work). on POM versions: The maven-release-plugin can be used to change the version in the poms during release. I believe it would be something like mvn -B release:update-versions -DautoVersionSubmodules= true -DdevelopmentVersion=`cat ../../share/VERSION.txt` To change the versions to 1.5.0. Or, without the -B command it will prompt for the version interactively. I'll look into what the maven-properties-plugin can do. That might be simpler – most of the maven tools assume that that the whole build is within maven. And we're never realistically going to have the same build system across 5+ languages.
        Hide
        Scott Carey added a comment -

        AVRO-716.v2.patch

        Ties new java build in with parent build. Several improvements over the last version. Interop tests function correctly now as well.

        Also removes ivy and ant related files from lang/java and removes lang/java/lib.

        There were some licence files in lang/java/lib but they were all out of date, and we don't have any jars checked into svn. I don't think we need these.

        The top level build uses the maven-enforcer-plugin during ./build.sh dist to enforce that the versions of the artifacts match share/VERSION.txt. The dist target will fail if they do not. During a branch or tag process, we can use the maven-versions-plugin to change all the version numbers at the same time that we change VERSION.txt.

        Show
        Scott Carey added a comment - AVRO-716 .v2.patch Ties new java build in with parent build. Several improvements over the last version. Interop tests function correctly now as well. Also removes ivy and ant related files from lang/java and removes lang/java/lib. There were some licence files in lang/java/lib but they were all out of date, and we don't have any jars checked into svn. I don't think we need these. The top level build uses the maven-enforcer-plugin during ./build.sh dist to enforce that the versions of the artifacts match share/VERSION.txt. The dist target will fail if they do not. During a branch or tag process, we can use the maven-versions-plugin to change all the version numbers at the same time that we change VERSION.txt.
        Hide
        Scott Carey added a comment -

        Updated patch using --no-diff-deleted to reduce its size. No other difference from the prior patch.

        Patch applies to a clean checkout fine, and 'build.sh clean test dist' works.

        Show
        Scott Carey added a comment - Updated patch using --no-diff-deleted to reduce its size. No other difference from the prior patch. Patch applies to a clean checkout fine, and 'build.sh clean test dist' works.
        Hide
        Doug Cutting added a comment -

        'build.sh dist' is failing for me with:

        [ERROR] Failed to execute goal on project avro-compiler: Could not resolve dependencies for project org.apache.avro:avro-compiler:jar:1.5.0-SNAPSHOT: Could not find artifact org.apache.avro:avro:jar:1.5.0-SNAPSHOT in jboss-public-repository (http://repository.jboss.org/nexus/content/groups/public/) -> [Help 1]
        [ERROR]

        This works if I first run 'cd lang/java; mvn install'. Is there a way to not require this? I prefer not to have a snapshot installed locally.

        Show
        Doug Cutting added a comment - 'build.sh dist' is failing for me with: [ERROR] Failed to execute goal on project avro-compiler: Could not resolve dependencies for project org.apache.avro:avro-compiler:jar:1.5.0-SNAPSHOT: Could not find artifact org.apache.avro:avro:jar:1.5.0-SNAPSHOT in jboss-public-repository ( http://repository.jboss.org/nexus/content/groups/public/ ) -> [Help 1] [ERROR] This works if I first run 'cd lang/java; mvn install'. Is there a way to not require this? I prefer not to have a snapshot installed locally.
        Hide
        Scott Carey added a comment -

        I'll run through test cases where I blow away my local repo and run each target individually.

        I don't know if there is a way to avoid installing to the local repo. I hadn't considered that since we're only a couple MB.

        Show
        Scott Carey added a comment - I'll run through test cases where I blow away my local repo and run each target individually. I don't know if there is a way to avoid installing to the local repo. I hadn't considered that since we're only a couple MB.
        Hide
        Doug Cutting added a comment -

        I dislike installing to the local repo unless I'm developing one project against an unreleased version of another. I worry that as I'm working on different issues within a single project that things will get confused when the common local repo is used. Should I just get over it?

        Show
        Doug Cutting added a comment - I dislike installing to the local repo unless I'm developing one project against an unreleased version of another. I worry that as I'm working on different issues within a single project that things will get confused when the common local repo is used. Should I just get over it?
        Hide
        Scott Carey added a comment -

        That can be an issue. Generally it is not.

        For example – if you have (like me)
        src/avro-716/trunk
        src/avro-737/trunk
        src/avro-perf/trunk

        and are using all three at the same time, you might mix/match things. There are a few solutions:

        • build from the top level – it only uses snapshots in your if the code required is not in the 'reactor' of the build. Note, this may be less reliable (buggy) for the avro-maven-plugin. I've seen it use the local repo over the reactor for the plugin once or twice, but more inside of the m2eclipse plugin than command line.
        • use 'install' more regularly – if your work is interleaved, this should work fine.
        • change your version locally with mvn versions:set to create distinct artifacts.

        I'll explore this a little more to see what parts of our build are sensitive to the repo and what parts aren't. There are a couple places where we call directly to a sub-project instead of the parent, which will trigger dependency on the local repo over the reactor. The build will be a little slower if we use the top level instead, but probably more stable.

        Show
        Scott Carey added a comment - That can be an issue. Generally it is not. For example – if you have (like me) src/avro-716/trunk src/avro-737/trunk src/avro-perf/trunk and are using all three at the same time, you might mix/match things. There are a few solutions: build from the top level – it only uses snapshots in your if the code required is not in the 'reactor' of the build. Note, this may be less reliable (buggy) for the avro-maven-plugin. I've seen it use the local repo over the reactor for the plugin once or twice, but more inside of the m2eclipse plugin than command line. use 'install' more regularly – if your work is interleaved, this should work fine. change your version locally with mvn versions:set to create distinct artifacts. I'll explore this a little more to see what parts of our build are sensitive to the repo and what parts aren't. There are a couple places where we call directly to a sub-project instead of the parent, which will trigger dependency on the local repo over the reactor. The build will be a little slower if we use the top level instead, but probably more stable.
        Hide
        Scott Carey added a comment -

        Updated patch and script.

        Run the patch before the script.

        Changes:
        ./build.sh clean test dist

        does not use the local repo to publish artifacts at any time. Script includes deletion of old build items and change in svn:ignore.

        With a change like this, do we need to modify CHANGES.txt? Or is the comment from the parent AVRO-647 sufficient? The most we can say is "Java: integrate AVRO-647 changes with top level build".

        Show
        Scott Carey added a comment - Updated patch and script. Run the patch before the script. Changes: ./build.sh clean test dist does not use the local repo to publish artifacts at any time. Script includes deletion of old build items and change in svn:ignore. With a change like this, do we need to modify CHANGES.txt? Or is the comment from the parent AVRO-647 sufficient? The most we can say is "Java: integrate AVRO-647 changes with top level build".
        Hide
        Doug Cutting added a comment -

        +1 This looks good to me. I think adding the one-line message you propose to CHANGES.txt would be good.

        Show
        Doug Cutting added a comment - +1 This looks good to me. I think adding the one-line message you propose to CHANGES.txt would be good.
        Hide
        Scott Carey added a comment -

        Committed in revision 1066155.

        Show
        Scott Carey added a comment - Committed in revision 1066155.

          People

          • Assignee:
            Scott Carey
            Reporter:
            Scott Carey
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development