Pig / PIG-1632

The core jar in the tarball contains the kitchen sink

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0, 0.9.0
    • Fix Version/s: site, 0.9.0
    • Component/s: build
    • Labels:
      None

      Description

      The core jar in the tarball contains the kitchen sink; it is not the same core jar that ant jar builds. This is a problem because other projects that depend on the Pig core jar want only Pig core, but the pig-0.8.0-SNAPSHOT-core.jar in the tarball also contains a lot of other packages (hadoop, com.google, commons, etc.) that may conflict with packages already on a user's classpath.

      pig1 (trunk)$ jar tvf build/pig-0.8.0-SNAPSHOT-core.jar |grep -v pig|wc -l
      12
      pig1 (trunk)$ tar xvzf build/pig-0.8.0-SNAPSHOT.tar.gz
      ...
      pig1 (trunk)$ jar tvf pig-0.8.0-SNAPSHOT/pig-0.8.0-SNAPSHOT-core.jar |grep -v pig|wc -l
      4819
      

      How about restricting the core jar to just Pig classes?
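
The comparison in the transcript above can also be scripted, which makes it easier to see which packages the extra entries come from. The following is a minimal sketch, not part of the attached patches: the function names are hypothetical, and the substring filter only approximates `grep -v pig`, since it looks at entry names rather than the full `jar tvf` listing lines.

```python
import zipfile
from collections import Counter


def non_pig_entries(jar_path, marker="pig"):
    """Return the jar entry names that do not contain `marker` (roughly `grep -v pig`)."""
    # A jar is a zip archive, so the standard zipfile module can list it.
    with zipfile.ZipFile(jar_path) as jar:
        return [name for name in jar.namelist() if marker not in name]


def top_level_counts(names):
    """Count entries by their first path component, e.g. 'com', 'org', 'META-INF'."""
    return Counter(name.split("/", 1)[0] for name in names)
```

Calling `len(non_pig_entries(path))` on the jar from `ant jar` and on the core jar unpacked from the tarball should show the same kind of gap as the two `wc -l` counts, and `top_level_counts` indicates where the additional entries originate.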

      1. pig-1632-1.patch
        0.4 kB
        Eli Collins
      2. pig-1632-2.patch
        0.5 kB
        Eli Collins

        Activity

        Olga Natkovich added a comment -

        Patch committed to both the 0.8 branch and trunk. Thanks, Eli, for contributing!

        Olga Natkovich added a comment -

        +1, patch looks good. I will commit it to trunk and the 0.8 branch shortly.

        Eli Collins added a comment -

        Great. Patch attached. I verified the tarball produced by ant tar includes both a core jar that is just pig core and a pig jar that has everything.
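
A check like this can be scripted against the tarball directly. The sketch below is an illustration under assumptions rather than the verification actually used: `jar_summary` is a hypothetical helper, it assumes a gzip-compressed tarball, and it treats any entry whose name lacks the substring "pig" as a non-Pig entry.

```python
import io
import tarfile
import zipfile


def jar_summary(tar_path, marker="pig"):
    """Map each .jar member of a .tar.gz to (total entries, entries without `marker`)."""
    summary = {}
    with tarfile.open(tar_path, "r:gz") as tar:
        for member in tar.getmembers():
            # Skip directories and anything that is not a jar file.
            if not (member.isfile() and member.name.endswith(".jar")):
                continue
            # Read the jar into memory and list it as a zip archive.
            data = tar.extractfile(member).read()
            with zipfile.ZipFile(io.BytesIO(data)) as jar:
                names = jar.namelist()
            extras = sum(marker not in name for name in names)
            summary[member.name] = (len(names), extras)
    return summary
```

In the resulting mapping, a core jar that holds only Pig classes should show a small second number, while the bundled jar should show a large one. Any jars shipped under lib would appear in the mapping as well.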

        Olga Natkovich added a comment -

        I am fine with your second proposal, which is what I also suggested in my last comment. The first one makes it harder for users to compile their UDFs.

        Eli Collins added a comment -

        Hey Olga,

        Thanks for the feedback. I agree that we want the out-of-the-box experience to use the same versions of the other jars we've been testing with, but shouldn't that happen by bundling the necessary jars in, e.g., the lib directory rather than embedding all of them inside the core pig jar?

        If people want all the dependencies bundled into a single jar, how about I update the patch so the release has two jars: a pig.jar which is like the current one (has all the other jars bundled in) and a pig-core.jar which just has pig?

        Thanks,
        Eli

        Olga Natkovich added a comment -

        Hi Eli, thanks for the patch.

        I don't think this is the approach we want to take. I think we should publish just the core pig jar in maven, since users there have a way to pull the dependencies. However, as part of our release package we should include the bundled pig.jar so that it works for users out of the box and they get exactly the versions we have been testing with. I am fine with additionally including the core jar, if we do not do this already.

        Eli Collins added a comment -

        Attached patch updates the package target so that the tarball, and therefore the Pig release, contains just the Pig core jar. If a Pig release needs to bundle Hadoop and a bunch of other stuff, perhaps we could put those jars in lib instead of inside the core jar.

        Running things like the tests out of a tarball that includes just the core jar works, since the dependencies come in via ivy. Is there anything else that needs to be tested?


          People

          • Assignee: Eli Collins
          • Reporter: Eli Collins
          • Votes: 0
          • Watchers: 2
