Pig
  1. Pig
  2. PIG-2362

Rework Ant build.xml to use macrodef instead of antcall

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.0
    • Component/s: None
    • Labels:
      None

      Description

      Antcall is evil: http://www.build-doctor.com/2008/03/13/antcall-is-evil/
      We'd better use macrodef and let Ant build a clean dependency graph.
      http://ant.apache.org/manual/Tasks/macrodef.html

      Right now we do like this:

      <target name="buildAllJars">
        <antcall target="buildJar">
          <param name="build.dir" value="jar-A"/>
        </antcall>
        <antcall target="buildJar">
          <param name="build.dir" value="jar-B"/>
        </antcall>
        <antcall target="buildJar">
          <param name="build.dir" value="jar-C"/>
        </antcall>
      </target>
      <target name="buildJar">
        <jar destfile="target/${build.dir}.jar" basedir="${build.dir}/classfiles"/>
      </target>
      

      But it would be better if we did like this:

      <target name="buildAllJars">
        <buildJar build.dir="jar-A"/>
        <buildJar build.dir="jar-B"/>
        <buildJar build.dir="jar-C"/>
      </target>
      
      <macrodef name="buildJar">
        <attribute name="build.dir"/>
        <jar destfile="target/${build.dir}.jar" basedir="${build.dir}/classfiles"/>
      </macrodef>
      
      1. PIG-2362.10.patch
        75 kB
        Cheolsoo Park
      2. PIG-2362.9.patch.nowhitespace
        28 kB
        Cheolsoo Park
      3. PIG-2362.9.patch
        76 kB
        Cheolsoo Park
      4. PIG-2362.8.patch
        31 kB
        Cheolsoo Park
      5. PIG-2362.7.patch
        31 kB
        Cheolsoo Park
      6. PIG-2362.6.patch
        29 kB
        Gianmarco De Francisci Morales
      7. PIG-2362.5.patch
        30 kB
        Gianmarco De Francisci Morales
      8. PIG-2362.4.patch
        29 kB
        Gianmarco De Francisci Morales
      9. PIG-2362.3.patch
        29 kB
        Gianmarco De Francisci Morales
      10. PIG-2362.2.patch
        29 kB
        Gianmarco De Francisci Morales
      11. PIG-2362.1.patch
        14 kB
        Gianmarco De Francisci Morales

        Issue Links

          Activity

          Hide
          Gianmarco De Francisci Morales added a comment -

          First attempt PIG-2362.1.patch.
          I removed much of the duplicated code and the bloat in build.xml
          There are still more <antcall> to evict.

          This patch also removes 1 useless call to ivy:resolve.

          I just wanted to have some feedback from the community before I go on.

          Also, do we still need the JavaCC part now that we use ANTLR?

          Show
          Gianmarco De Francisci Morales added a comment - First attempt PIG-2362 .1.patch. I removed much of the duplicated code and the bloat in build.xml There are still more <antcall> to evict. This patch also removes 1 useless call to ivy:resolve. I just wanted to have some feedback from the community before I go on. Also, do we still need the JavaCC part now that we use ANTLR?
          Hide
          Alan Gates added a comment -

          +1, patch looks good. Runs fine with jar, jar-withouthadoop, clean, test, docs, test-e2e, and package

          Show
          Alan Gates added a comment - +1, patch looks good. Runs fine with jar, jar-withouthadoop, clean, test, docs, test-e2e, and package
          Hide
          Gianmarco De Francisci Morales added a comment -

          Attaching PIG-2362.2.patch
          Removed all ant-calls.
          Tested with jar, jar-withouthadoop, test, test-e2e, package, docs.

          The patch is ready for review.

          Show
          Gianmarco De Francisci Morales added a comment - Attaching PIG-2362 .2.patch Removed all ant-calls. Tested with jar, jar-withouthadoop, test, test-e2e, package, docs. The patch is ready for review.
          Hide
          Gianmarco De Francisci Morales added a comment -

          Attaching PIG-2362.3.patch
          Rebased patch on latest trunk.
          Ready for review.

          Show
          Gianmarco De Francisci Morales added a comment - Attaching PIG-2362 .3.patch Rebased patch on latest trunk. Ready for review.
          Hide
          Daniel Dai added a comment -

          I guess you need to resync again

          Show
          Daniel Dai added a comment - I guess you need to resync again
          Hide
          Gianmarco De Francisci Morales added a comment -

          Rebased once again

          Show
          Gianmarco De Francisci Morales added a comment - Rebased once again
          Hide
          Daniel Dai added a comment -

          Seems "include-meta" is not being called after patch? This is necessary for hadoop 23.

          Show
          Daniel Dai added a comment - Seems "include-meta" is not being called after patch? This is necessary for hadoop 23.
          Hide
          Gianmarco De Francisci Morales added a comment -

          Fixed the issue for "include-meta" and removed another antcall in the process.
          I am waiting for PIG-2349 to be committed in order to rebase the patch as they will most likely step on each other's toes.

          Show
          Gianmarco De Francisci Morales added a comment - Fixed the issue for "include-meta" and removed another antcall in the process. I am waiting for PIG-2349 to be committed in order to rebase the patch as they will most likely step on each other's toes.
          Hide
          Gianmarco De Francisci Morales added a comment -

          Attaching PIG-2362.5.patch
          This solves the include-meta issue, even though I had to keep the antcall for it.
          I will refactor it in a separate ticket.

          Show
          Gianmarco De Francisci Morales added a comment - Attaching PIG-2362 .5.patch This solves the include-meta issue, even though I had to keep the antcall for it. I will refactor it in a separate ticket.
          Hide
          Daniel Dai added a comment -

          Seems still get some problem with hadoop 23 local mode. To reproduce, run Pig with hadoop23 in local mode:
          ant -Dhadoopversion=23
          pig -x local

          Then run a trivial load/dump.

          Here is the error I see:
          Caused by: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
          at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:123)
          at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:85)
          at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:78)
          at org.apache.hadoop.mapred.JobClient.init(JobClient.java:476)
          at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:455)
          at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:155)
          at org.apache.pig.PigServer.launchPlan(PigServer.java:1271)
          ... 11 more

          Seems it still has something to do with include-meta.

          Show
          Daniel Dai added a comment - Seems still get some problem with hadoop 23 local mode. To reproduce, run Pig with hadoop23 in local mode: ant -Dhadoopversion=23 pig -x local Then run a trivial load/dump. Here is the error I see: Caused by: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses. at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:123) at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:85) at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:78) at org.apache.hadoop.mapred.JobClient.init(JobClient.java:476) at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:455) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:155) at org.apache.pig.PigServer.launchPlan(PigServer.java:1271) ... 11 more Seems it still has something to do with include-meta.
          Hide
          Gianmarco De Francisci Morales added a comment -

          Switching to Maven renders this Jira obsolete.

          Show
          Gianmarco De Francisci Morales added a comment - Switching to Maven renders this Jira obsolete.
          Hide
          Gianmarco De Francisci Morales added a comment -

          Switching to Maven renders this Jira obsolete.

          Show
          Gianmarco De Francisci Morales added a comment - Switching to Maven renders this Jira obsolete.
          Hide
          Dmitriy V. Ryaboy added a comment -

          Gianmarco, since we have made no progress on mavenizing Pig.. mind taking another crack at this? Seems like we were almost there.

          Show
          Dmitriy V. Ryaboy added a comment - Gianmarco, since we have made no progress on mavenizing Pig.. mind taking another crack at this? Seems like we were almost there.
          Hide
          Gianmarco De Francisci Morales added a comment -

          Too bad we didn't manage to mavenize Pig.
          Yes, I will take another crack at it later this week.

          Show
          Gianmarco De Francisci Morales added a comment - Too bad we didn't manage to mavenize Pig. Yes, I will take another crack at it later this week.
          Hide
          Gianmarco De Francisci Morales added a comment -

          On trunk, if I compile with -Dhadoopversion=23 and I launch bin/pig -x local I get the following error repeated continuously:

          2012-09-17 22:44:54,287 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. Could not initialize class org.apache.pig.tools.pigstats.PigStatsUtil
          Details at logfile: /Users/gdfm/workspace/pig/git/pig_1347914693509.log
          

          Here the stack trace:

          Pig Stack Trace
          ---------------
          ERROR 2998: Unhandled internal error. org/apache/hadoop/mapreduce/task/JobContextImpl
          
          java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/task/JobContextImpl
                  at org.apache.pig.tools.pigstats.PigStatsUtil.<clinit>(PigStatsUtil.java:54)
                  at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:67)
                  at org.apache.pig.Main.run(Main.java:538)
                  at org.apache.pig.Main.main(Main.java:154)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                  at java.lang.reflect.Method.invoke(Method.java:597)
                  at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
          Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.task.JobContextImpl
                  at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
                  at java.security.AccessController.doPrivileged(Native Method)
                  at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
                  at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
                  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
                  at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
                  ... 9 more
          ================================================================================
          Pig Stack Trace
          ---------------
          ERROR 2998: Unhandled internal error. Could not initialize class org.apache.pig.tools.pigstats.PigStatsUtil
          

          So I am unable to test if the new build works with Hadoop 23.

          This patch produces a new file, pig-core.jar, in the top level directory.
          pig-core.jar is just a copy of pig-0.11.0-SNAPSHOT.jar in build/.
          This is done just for convenience. Once we get rid of compatibility we can remove all the pig jars from the top level directory.

          Show
          Gianmarco De Francisci Morales added a comment - On trunk, if I compile with -Dhadoopversion=23 and I launch bin/pig -x local I get the following error repeated continuously: 2012-09-17 22:44:54,287 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. Could not initialize class org.apache.pig.tools.pigstats.PigStatsUtil Details at logfile: /Users/gdfm/workspace/pig/git/pig_1347914693509.log Here the stack trace: Pig Stack Trace --------------- ERROR 2998: Unhandled internal error. org/apache/hadoop/mapreduce/task/JobContextImpl java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/task/JobContextImpl at org.apache.pig.tools.pigstats.PigStatsUtil.<clinit>(PigStatsUtil.java:54) at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:67) at org.apache.pig.Main.run(Main.java:538) at org.apache.pig.Main.main(Main.java:154) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.task.JobContextImpl at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang. ClassLoader .loadClass( ClassLoader .java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang. ClassLoader .loadClass( ClassLoader .java:247) ... 9 more ================================================================================ Pig Stack Trace --------------- ERROR 2998: Unhandled internal error. Could not initialize class org.apache.pig.tools.pigstats.PigStatsUtil So I am unable to test if the new build works with Hadoop 23. This patch produces a new file, pig-core.jar, in the top level directory. pig-core.jar is just a copy of pig-0.11.0-SNAPSHOT.jar in build/. This is done just for convenience. Once we get rid of compatibility we can remove all the pig jars from the top level directory.
          Hide
          Cheolsoo Park added a comment -

          I updated the patch based to trunk and fixed a few minor issues. Here is what I did:

          • Re-based to trunk.
          • Incorporated PIG-2979 to get local mode working with hadoop 2.0.x.
          • Incorporated PIG-2885 to preserve it.
          • Removed duplicate code in the fileset definitions of included jars for w/ and w/o dependencies using patternset.
          • Included joda-time.jar in pig-withouthadoop.jar as well as in pig.jar. It should but wasn't included.
          • Fixed a typo in the buildJar macrodef: conf/pig-default.properties => src/pig-default.properties. This was causing e2e test to fail.

          Testing done:

          • ant jar -Dhadoopversion=[20|23]
          • ant test -Dhadoopversion=[20|23]
          • ant test-e2e-local -Dhadoopversion=[20|23]
          • ant test-e2e -Dhadoopversion=[20|23]
          • ant clean
          • ant docs

          I also confirmed that there is no diff of the following command between before and after:

          jar -tvf pig.jar | awk '{print $8}'
          

          Thanks!

          Show
          Cheolsoo Park added a comment - I updated the patch based to trunk and fixed a few minor issues. Here is what I did: Re-based to trunk. Incorporated PIG-2979 to get local mode working with hadoop 2.0.x. Incorporated PIG-2885 to preserve it. Removed duplicate code in the fileset definitions of included jars for w/ and w/o dependencies using patternset. Included joda-time.jar in pig-withouthadoop.jar as well as in pig.jar. It should but wasn't included. Fixed a typo in the buildJar macrodef: conf/pig-default.properties => src/pig-default.properties. This was causing e2e test to fail. Testing done: ant jar -Dhadoopversion= [20|23] ant test -Dhadoopversion= [20|23] ant test-e2e-local -Dhadoopversion= [20|23] ant test-e2e -Dhadoopversion= [20|23] ant clean ant docs I also confirmed that there is no diff of the following command between before and after: jar -tvf pig.jar | awk '{print $8}' Thanks!
          Hide
          Cheolsoo Park added a comment -

          I don't think that we want to backport this to branch-0.11, so I am un-linking from 0.11. Please let me know if anyone think otherwise. I will post a patch for 0.11.

          p.s. A typo correction in my previous comment. The command which I compared the diff of is:

          jar -tvf pig-withouthadoop.jar | awk '{print $8}'
          

          Pig.jar has extra contents because I incorporated PIG-2979.

          Show
          Cheolsoo Park added a comment - I don't think that we want to backport this to branch-0.11, so I am un-linking from 0.11. Please let me know if anyone think otherwise. I will post a patch for 0.11. p.s. A typo correction in my previous comment. The command which I compared the diff of is: jar -tvf pig-withouthadoop.jar | awk '{print $8}' Pig.jar has extra contents because I incorporated PIG-2979 .
          Hide
          Cheolsoo Park added a comment -

          Rebased to trunk again.

          Show
          Cheolsoo Park added a comment - Rebased to trunk again.
          Hide
          Gianmarco De Francisci Morales added a comment -

          Hi Cheolsoo,
          Would you mind rebasing it to trunk once again?
          It does not apply cleanly anymore.
          I will review it as soon as it is rebased.

          Show
          Gianmarco De Francisci Morales added a comment - Hi Cheolsoo, Would you mind rebasing it to trunk once again? It does not apply cleanly anymore. I will review it as soon as it is rebased.
          Hide
          Cheolsoo Park added a comment -

          Hi Gianmarco,

          I rebased the patch to trunk. I am attaching patches - with and without whitespace changes for your convenience.

          Thanks!

          Show
          Cheolsoo Park added a comment - Hi Gianmarco, I rebased the patch to trunk. I am attaching patches - with and without whitespace changes for your convenience. Thanks!
          Hide
          Gianmarco De Francisci Morales added a comment -

          +1
          It works for both Hadoop 20 and 23.

          Thanks for the work, Cheolsoo.

          Show
          Gianmarco De Francisci Morales added a comment - +1 It works for both Hadoop 20 and 23. Thanks for the work, Cheolsoo.
          Hide
          Cheolsoo Park added a comment -

          Hi Gianmarco,

          Thank you very much for the review.

          Unfortunately, I realized that I hadn't incorporated PIG-2748 where pig-core.jar is renamed to pig-<version>.jar. This makes ant mvn-jar fail, so I fixed the issue in a new patch.

          The changes include:

          • Removed the pig-core target and pig-core.jar from the top level.
          • Build pig-<version>.jar in the pig and pig-withouthadoop targets.

          I verified that ant jar-all builds the same jar files as before.

          Would mind doing a review one more time?

          Thanks!

          Show
          Cheolsoo Park added a comment - Hi Gianmarco, Thank you very much for the review. Unfortunately, I realized that I hadn't incorporated PIG-2748 where pig-core.jar is renamed to pig-<version>.jar . This makes ant mvn-jar fail, so I fixed the issue in a new patch. The changes include: Removed the pig-core target and pig-core.jar from the top level. Build pig-<version>.jar in the pig and pig-withouthadoop targets. I verified that ant jar-all builds the same jar files as before. Would mind doing a review one more time? Thanks!
          Hide
          Gianmarco De Francisci Morales added a comment -

          Hi Cheolsoo,

          Good catch, thanks. I wasn't familiar with PIG-2748.
          I verified that ant mvn-jar passes.
          Also ran the eclipse-files, src-release, tar-release targets and verified their output.

          +1 to the last patch

          Show
          Gianmarco De Francisci Morales added a comment - Hi Cheolsoo, Good catch, thanks. I wasn't familiar with PIG-2748 . I verified that ant mvn-jar passes. Also ran the eclipse-files , src-release , tar-release targets and verified their output. +1 to the last patch
          Hide
          Cheolsoo Park added a comment -

          Committed to trunk. Thanks Gianmarco!

          Show
          Cheolsoo Park added a comment - Committed to trunk. Thanks Gianmarco!

            People

            • Assignee:
              Gianmarco De Francisci Morales
              Reporter:
              Gianmarco De Francisci Morales
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development