HBase / HBASE-2128

ant tar build broken since switch to Ivy

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.90.0
    • Fix Version/s: 0.90.0
    • Component/s: build
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Running ant tar produces a very small tar file because all .jar dependencies are missing. This has been the case since the switch to Ivy.

      Adding common.ivy.lib.dir to the build.xml fixes some of it but some things still don't work:

          <mkdir dir="${dist.dir}/lib"/>
          <copy todir="${dist.dir}/lib">
            <fileset dir="${build.lib}" />
            <fileset dir="${common.ivy.lib.dir}"/>
          </copy>
      

      The jars for the contrib apps still seem to be missing. At the moment this is only stargate, but I've got the same problem with the new thrift contrib. I am afraid I don't know enough about Ant or Ivy to be of any further assistance.

      1. HBASE-2128.patch (1 kB, Karthik K)
      2. HBASE-2128.patch (2 kB, Karthik K)
      3. HBASE-2128.patch (1 kB, Karthik K)
      4. HBASE-2128.patch (1 kB, Karthik K)

        Activity

        Karthik K added a comment -

        Will try to look further.

        But the tarball for a release is expected to be smaller after HBASE-1433, because the lib*.jars are not supposed to be released but will be retrieved on demand.

        zk and thrift are retained only because their artifacts have not yet been published.

        ryan rawson added a comment -

        Tar is supposed to be self contained because it is the basis of our packaging and release. We are not going to force our users to download all the deps on their prod machines.

        Karthik K added a comment -

        • tar now maintains 3 different sets: lib/core/*.jar, lib/transactional/*.jar, lib/stargate/*.jar

        Thrift + zookeeper, which are common to all of them, are in lib only.

        • Bug in the init target, due to a space in libthrift: it was not being copied

        • Unnecessary classpath element in the javadoc target removed, since no such file (contrib/*/*.jar) exists in the tree anymore.

        Having said that, the scripts need to be modified (or we need to identify a file-layout structure), since the scripts are meant for build/ivy/common, build/contrib/stargate/ivy/lib/common, etc., for a development environment.

        When we release, maybe we need to revisit what structure we want and refactor the scripts accordingly.
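
        A minimal Ant sketch of the three-set layout described in this comment, for illustration only. It is not the attached patch. The project and target names and every property value are assumptions; only the property names dist.dir, build.lib and common.ivy.lib.dir appear in the issue description, and the contrib directories are guessed from the paths mentioned above. It assumes Ivy has already retrieved the jars into those directories.

        ```xml
        <project name="tar-layout-sketch" default="package-libs">
          <!-- All property values are assumptions for illustration. -->
          <property name="dist.dir" value="build/dist"/>
          <property name="build.lib" value="lib"/>
          <property name="common.ivy.lib.dir" value="build/ivy/lib/common"/>
          <property name="stargate.ivy.lib.dir"
                    value="build/contrib/stargate/ivy/lib/common"/>
          <property name="transactional.ivy.lib.dir"
                    value="build/contrib/transactional/ivy/lib/common"/>

          <target name="package-libs">
            <!-- Jars shared by every set (thrift, zookeeper) go directly in lib/. -->
            <mkdir dir="${dist.dir}/lib"/>
            <copy todir="${dist.dir}/lib">
              <fileset dir="${build.lib}" includes="*.jar"/>
            </copy>

            <!-- One subdirectory per classpath namespace. -->
            <copy todir="${dist.dir}/lib/core">
              <fileset dir="${common.ivy.lib.dir}" includes="*.jar"/>
            </copy>
            <copy todir="${dist.dir}/lib/stargate">
              <fileset dir="${stargate.ivy.lib.dir}" includes="*.jar"/>
            </copy>
            <copy todir="${dist.dir}/lib/transactional">
              <fileset dir="${transactional.ivy.lib.dir}" includes="*.jar"/>
            </copy>
          </target>
        </project>
        ```

        If a contrib's Ivy directory also contains the core dependencies, this layout copies them once per contrib, which makes the tarball larger.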

        Karthik K added a comment -

        Next version of the same.

        In addition to the previous:

        • added lib/core/*.jar to the CP in the hbase script as well; that should work with the release tarball.

        The CP entries are -

        lib/(thrift/zookeeper)
        lib/core/*.jar

        HBase team - feel free to swap the order if that does not sound right.

        Meanwhile - the CP jars of stargate / transactional are kept in a separate namespace altogether - not affecting the scripts.

        As the number of contrib packages increases, it would be useful to keep core and the launching script in a separate namespace.

        Lars Francke added a comment -

        Meanwhile - the CP jars of stargate / transactional are kept in a separate namespace altogether - not affecting the scripts.

        I don't know if you are doing this intentionally but now the core jars are duplicated in the lib/* folders. This not only doubles the size of the final tar (from 35 MB to 70 MB) but it seems to cause problems with SLF4J (which will have to be introduced as a new dependency for Thrift 0.2). The latter may be my fault though, I'm struggling to get everything running as it was before the switch to Ivy. I'll comment again if it turns out the problem was caused by me.

        Either way the duplication should be unnecessary. At least I can't imagine why it would be required.

        The src/contrib/stargate/lib folder is empty and can be deleted/does not need to be created at all.

        Thanks for your quick fix!

        Karthik K added a comment -

        I don't know if you are doing this intentionally but now the core jars are duplicated in the lib/* folders.

        As I mentioned earlier, it was intentional, to separate the CP namespace for core / transactional / <other contribs>.

        If that is too much complexity, then modify the tar patch so that all the target files (the dependency jars from Ivy) get copied to the lib directory as lib/*.jar.

        The src/contrib/stargate/lib folder is empty and can be deleted/does not need to be created at all.

        true. thanks.

        I'm struggling to get everything running as it was before the switch to Ivy. I'll comment again if it turns out the problem was caused by me.

        Feel free to log tickets / add more details about the broken parts so that we can fix them.

        Andrew Purtell added a comment -

        The src/contrib/stargate/lib folder is empty and can be deleted/does not need to be created at all.

        I found that if it was not there, Ivy would error out.

        stack added a comment -

        Maybe we need to add a mkdir somewhere? mkdir after copy of stargate to build for compile?
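
        The mkdir idea could look roughly like the sketch below. The thread does not say which target performs the copy or which directory Ivy expects, so the project name, target name and both paths are assumptions.

        ```xml
        <project name="stargate-lib-dir-sketch" default="prepare-stargate">
          <!-- Both paths are assumptions; adjust to the real build layout. -->
          <property name="src.stargate.dir" value="src/contrib/stargate"/>
          <property name="build.stargate.dir" value="build/contrib/stargate"/>

          <target name="prepare-stargate">
            <copy todir="${build.stargate.dir}">
              <fileset dir="${src.stargate.dir}"/>
            </copy>
            <!-- mkdir does nothing if the directory already exists, so this is
                 safe whether or not an empty lib/ folder is checked in. -->
            <mkdir dir="${build.stargate.dir}/lib"/>
          </target>
        </project>
        ```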

        Karthik K added a comment -

        So, what is the consensus regarding the directory structure of the release, as far as the lib directory is concerned?

        If we want to have all the jars in lib/*.jar as before, I am OK with that, but keep in mind that as we add more contribs we may want to separate the CP namespaces for them to make things easier to debug.

        The src/contrib/stargate/lib folder is empty and can be deleted/does not need to be created at all.
        Maybe we need to add a mkdir somewhere? mkdir after copy of stargate to build for compile?

        I am OK either way, but if it is distracting at this point, I am OK with maintaining the status quo while removing any CP reference in the build.xml files to contrib/*/lib/*.jar, since no such file exists or should exist in future.

        Karthik K added a comment -

        Patch that restores the pre-Ivy layout (all jars in lib/*.jar).
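
        For illustration, a flat lib/*.jar layout can be produced in Ant with the flatten attribute of the copy task. This is a hedged sketch, not the contents of the attached patch; the project and target names and all property values are assumptions, with the two contribs listed explicitly because the thread says they were hardcoded.

        ```xml
        <project name="flat-lib-sketch" default="package-libs-flat">
          <!-- All property values are assumptions for illustration. -->
          <property name="dist.dir" value="build/dist"/>
          <property name="build.lib" value="lib"/>
          <property name="common.ivy.lib.dir" value="build/ivy/lib/common"/>
          <property name="stargate.ivy.lib.dir"
                    value="build/contrib/stargate/ivy/lib/common"/>
          <property name="transactional.ivy.lib.dir"
                    value="build/contrib/transactional/ivy/lib/common"/>

          <target name="package-libs-flat">
            <mkdir dir="${dist.dir}/lib"/>
            <!-- flatten="true" drops the source directory structure, so every
                 jar lands directly in lib/ and jars with the same file name
                 end up as a single file. -->
            <copy todir="${dist.dir}/lib" flatten="true">
              <fileset dir="${build.lib}" includes="**/*.jar"/>
              <fileset dir="${common.ivy.lib.dir}" includes="**/*.jar"/>
              <fileset dir="${stargate.ivy.lib.dir}" includes="**/*.jar"/>
              <fileset dir="${transactional.ivy.lib.dir}" includes="**/*.jar"/>
            </copy>
          </target>
        </project>
        ```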

        Lars Francke added a comment -

        My preferred way of doing this would be to keep only the extra dependencies for the contribs in their own folders.

        So: all core dependencies in lib/core, and all extra dependencies for stargate in lib/stargate, but not the core dependencies. If I remember correctly, this was the way it was done before. At least almost. If one decides to use stargate, all one needs to do is add one directory to the classpath.

        In all likelihood I'm missing something important here but I don't see why all jars would have to be duplicated. For 0.21 we'll probably have three contribs with dependencies (I'm not counting ec2): stargate, transactional and thrift. This would mean that all common jars would be included four times in the final tar.

        All dependencies in one folder doesn't seem right either.

        But I'm still very new to this project so feel free to disregard all I've said.
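
        One way to copy only a contrib's extra dependencies in Ant is the present selector, which can keep just the files that are missing from another directory. The sketch below is an assumption about how this proposal could be built, not something taken from the patches; the project and target names and all property values are hypothetical.

        ```xml
        <project name="contrib-extras-sketch" default="package-stargate-extras">
          <!-- All property values are assumptions for illustration. -->
          <property name="dist.dir" value="build/dist"/>
          <property name="common.ivy.lib.dir" value="build/ivy/lib/common"/>
          <property name="stargate.ivy.lib.dir"
                    value="build/contrib/stargate/ivy/lib/common"/>

          <target name="package-stargate-extras">
            <copy todir="${dist.dir}/lib/stargate">
              <fileset dir="${stargate.ivy.lib.dir}" includes="*.jar">
                <!-- srconly: select a jar only if no file with the same
                     relative name exists in the core directory. -->
                <present present="srconly" targetdir="${common.ivy.lib.dir}"/>
              </fileset>
            </copy>
          </target>
        </project>
        ```

        The selector compares file names only, so two different versions of the same library would both be kept.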

        Karthik K added a comment -

        My preferred way of doing this would be to keep only the extra dependencies for the contribs in their own folders.

        That would be the ideal thing, I guess. It would mean revisiting build.xml / build-contrib.xml to reuse the files and create only the 'diff' dependencies. Given that 'tar' is broken, I am trying to look for something quick and immediate.

        In all likelihood I'm missing something important here but I don't see why all jars would have to be duplicated

        Agreed; that was just meant to keep the contribs in a separate namespace. If the previous issue were resolved, then having core and contribs separate would be ideal.

        All dependencies in one folder doesn't seem right either.

        Yup, that was my main concern.

        So, did you look at the most recent patch (all in one lib/*.jar)? It is minimally intrusive to the scripts, but needs to be revisited in the long term.

        ryan rawson added a comment -

        The contrib method in general needs to be reworked. I think it's way out of scope here, so if we can do more or less what we were doing before, then I say it is fixed. I'm not sure how important it is to tease out the classpaths; my initial thought is 'doesn't matter'.

        Karthik K added a comment -

        Do one thing at a time. Do not deal with the javadoc issue here; see HBASE-2135 for the javadoc classpath issue.

        Andrew Purtell added a comment -

        For 0.21 we'll probably have three contribs with dependencies (I'm not counting ec2): stargate, transactional and thrift.

        Don't count out EC2. At some point the bash stuff will be deprecated and there will be a Python/libcloud based approach replacing it, like what Hadoop Core is doing. So the EC2 stuff is likely to pull dependencies.

        ryan rawson added a comment -

        I think it would be better if ec2 wasn't part of the core build stuff. Seems kind of weird...

        ryan rawson added a comment -

        While we are at it, we should turn off stdout/stderr output buffering, since during an Ivy build the 'download' does not actually print incremental progress on one line.

        Andrew Purtell added a comment -

        I think it would be better if ec2 wasn't part of the core build stuff. Seems kind of weird...

        It's not. It's a contrib – src/contrib/ec2/ ... Or did I miss the point?

        Karthik K added a comment -

        While we are at it, we should turn off stdout/stderr output buffering, since during an Ivy build the 'download' does not actually print incremental progress on one line.

        Makes sense. Would it be OK to handle it in a separate issue, though?

        Lars Francke added a comment -

        Kay Kay's latest patch seems to work perfectly with regard to the broken ant tar build. Is it ready to be committed?

        stack added a comment -

        I should commit this?

        ryan rawson added a comment -

        stack: do it, make it happen.

        stack added a comment -

        Applied to TRUNK. Thanks for the patch Kay Kay.

        Karthik K added a comment -

        Thanks for taking this patch. When I submitted it, there were only 2 contribs, which I had hardcoded (stargate / transactional). But with the addition of mdc_replication now, an entry might need to be added for that as well. I can track it separately.


          People

          • Assignee:
            Karthik K
            Reporter:
            Lars Francke
          • Votes:
            0
            Watchers:
            1