ZooKeeper / ZOOKEEPER-233

Create a slimmer jar for clients to reduce their disk footprint.

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Trivial
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 3.5.0
    • Component/s: build, java client
    • Labels:
      None
    • Release Note:
      not a blocker for 3.2, moving to 3.3

      Description

      Patrick requested that I open this issue, in this email thread.

          Activity

          Patrick Hunt added a comment -

          +1 - update build.xml to generate a new zookeeper-client jar containing only the ZK Java files needed for a client (don't modify the content of the existing zookeeper jar).

          Doug Cutting added a comment -

          Why not generate two disjoint jar files, one that requires the other?

          Patrick Hunt added a comment -

          I had 2 things in mind:

          1) backward compatibility
          2) simplicity (include 1 jar on the classpath rather than 2)

          Other than not duplicating classes in two jars, what are the benefits of disjoint jars?

          Doug Cutting added a comment -

          I think the more common thing for projects that generate multiple jars is for them to be disjoint, so it might be less confusing if they're disjoint.

          Patrick Hunt added a comment -

          @Doug: that's how I usually do it.

          I'm assuming we are talking about:
          zookeeper-client-<version>.jar - all classes necessary to run a client, including the generated marshalling code
          zookeeper-server-<version>.jar - all of the classes that are part of ZK, except those in the zookeeper-client jar

          This would be a non-backward-compatible change, i.e. the build, scripts (zkServer.sh/zkClient.sh), release packaging, etc. would have to change.

          Or should we have three: client, server, and common (marshalling code, etc.), in case we find more reasons to create additional "specialized" jars?

          We have 1 vote from Doug to have disjoint jars; anyone else?
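
To make the disjoint split discussed above concrete, here is a minimal sketch that assigns each class entry of a jar to exactly one of three jars. The package-prefix rules and the jar names are assumptions for illustration only, not the project's actual build layout; in particular, the sketch does not check whether client classes depend on anything routed to the server jar.

```python
from collections import defaultdict

# Hypothetical routing rules (assumption): entry-path prefix -> target jar.
# The first matching prefix wins; anything unmatched goes to the client jar.
RULES = [
    ("org/apache/zookeeper/server/", "server"),
    ("org/apache/jute/", "common"),
    ("org/apache/zookeeper/proto/", "common"),
    ("org/apache/zookeeper/data/", "common"),
]
DEFAULT_JAR = "client"


def assign_jar(entry):
    """Return the name of the single jar that should contain this entry."""
    for prefix, jar in RULES:
        if entry.startswith(prefix):
            return jar
    return DEFAULT_JAR


def partition(entries):
    """Group entries by target jar; each entry lands in exactly one group."""
    jars = defaultdict(list)
    for entry in entries:
        jars[assign_jar(entry)].append(entry)
    return dict(jars)
```

Because every entry is routed by a single function, the resulting jars are disjoint by construction, which is the property Doug asks for; the two-jar variant with duplicated marshalling code would instead copy the "common" group into both of the other jars.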

          Mahadev konar added a comment -

          I don't think the marshalling code should be in the client jar. The whole idea of having separate jars, I thought, was to run them without being dependent on each other. One option would be to have 3 jars, with the serialization/deserialization code in another jar, but this isn't a great idea: too many jars is a hassle. So I think we should just include the common ser/deser code in both the client and server jars for now, and have 2 jars with some common files?

          Patrick Hunt added a comment -

          If we are going to change this we should go with 3 jars - common, server, and client. If someone can submit a patch for 3.1.0, that will help get it in sooner rather than later. Be sure to update the release notes with a section detailing this change (i.e. that third-party run scripts will have to change the classpath).

          Mahadev konar added a comment -

          pushing it to 3.2

          Mahadev konar added a comment -

          not a blocker. Moving it out of 3.4 release.

          Thomas Koch added a comment -

          Copying the description of ZOOKEEPER-1003 from Jean-Pierre König (jpkoenig) here:

          This feature request applies to ZooKeeper, HBase, Hadoop and maybe other
          projects. Currently, to use one of these projects, I need to include one big
          jar file as a dependency, that

          • contains the complete server code,
          • contains much more code than I use,
          • and, most annoyingly, depends on many other jars that are mostly needed for the
            server but not for the client library.

          Thus, when using Maven and including any of the mentioned projects, the
          dependency graph of my projects grows unnecessarily large.

          This is a severe problem for at least two reasons:

          • The probability of conflicting dependencies (versions) increases.
          • Especially for MapReduce jobs depending on HBase or ZooKeeper, the jars sent to the
            clusters carry upwards of 20-30MB of unnecessary dependencies.

          One could work around the problem with Maven dependency exclusions, but this may lead to unpredictable runtime errors (ClassNotFound), since compile-time checks alone cannot guarantee that the dependencies are complete.

          I wish we could solve the underlying issue at the root with a client library.
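
The runtime risk of the exclusion workaround described above can be checked ahead of time. The sketch below, which is an illustration and not part of any ZooKeeper or Maven tooling, takes a list of fully qualified class names that a client is known to need and reports which ones are absent from the jars left on the classpath. The function names and the idea of maintaining a "required" list by hand are assumptions.

```python
import zipfile


def class_entry(class_name):
    """Map a fully qualified class name to its entry path inside a jar."""
    return class_name.replace(".", "/") + ".class"


def missing_classes(required, jar_paths):
    """Return the required class names not found in any of the given jars."""
    available = set()
    for path in jar_paths:
        # A jar is a zip archive, so the standard zipfile module can list it.
        with zipfile.ZipFile(path) as jar:
            available.update(jar.namelist())
    return [name for name in required if class_entry(name) not in available]
```

An empty result only means the listed classes are present; it does not prove that everything loaded reflectively at runtime is available, which is why a dedicated client library remains the cleaner fix.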

          Patrick Hunt added a comment -

          I've done some of this in the mavenization patch, ZOOKEEPER-1078: we are not including jute, etc. Also, the maven-dist jars (since 3.3?) no longer include the source (it is in a separate jar). The mavenization patch will continue this.


            People

            • Assignee: Unassigned
            • Reporter: Hiram Chirino
            • Votes: 2
            • Watchers: 6
