Avro
  1. Avro
  2. AVRO-791

Split avro-tools jar in 1.5.0 to one contains avro tool classes and one contains external dependencies

    Details

    • Type: Wish Wish
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.0
    • Fix Version/s: 1.5.1
    • Component/s: java
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently, avro-tools.jar for 1.5.0 contains both avro specific tools classes from org.apache.avro.tool and many external dependencies like org.slf4j.impl. The problem is that if i define my own logback impl for slf4j in my pom, it conflicts with the org.slf4j.impl.StaticLoggerBinder.class in avro-tools. It would be better to split it to two jars: one only contains avro specific tools, and another one only contains external dependencies

      1. AVRO-791.patch
        2 kB
        Scott Carey

        Issue Links

          Activity

          Hide
          Doug Cutting added a comment -

          I committed this. Thanks, Scott!

          Show
          Doug Cutting added a comment - I committed this. Thanks, Scott!
          Hide
          Doug Cutting added a comment -

          I'd like to commit this and roll a 1.5.1 release candidate. Objections?

          Show
          Doug Cutting added a comment - I'd like to commit this and roll a 1.5.1 release candidate. Objections?
          Hide
          Doug Cutting added a comment -

          +1 This works for me. Thanks!

          Show
          Doug Cutting added a comment - +1 This works for me. Thanks!
          Hide
          Scott Carey added a comment -

          Fixed patch, there was an error in the prior.

          Show
          Scott Carey added a comment - Fixed patch, there was an error in the prior.
          Hide
          Scott Carey added a comment -

          Patch adds avro-tools-VERSION-nodeps.jar.

          This patch also moves the 'sign' portion to a profile, so that users are not forced to sign in order to 'install' in normal development (add -P sign to do so).

          We could move that to another ticket, but signing was introduced without one anyhow.

          Show
          Scott Carey added a comment - Patch adds avro-tools-VERSION-nodeps.jar. This patch also moves the 'sign' portion to a profile, so that users are not forced to sign in order to 'install' in normal development (add -P sign to do so). We could move that to another ticket, but signing was introduced without one anyhow.
          Hide
          Doug Cutting added a comment -

          +1 for avro-tools-VERSION-nodeps.jar.

          Show
          Doug Cutting added a comment - +1 for avro-tools-VERSION-nodeps.jar.
          Hide
          Xiaolu Ye added a comment -

          To minimize the impact, a new jar with nodeps classifier sounds good me. Thanks!

          Show
          Xiaolu Ye added a comment - To minimize the impact, a new jar with nodeps classifier sounds good me. Thanks!
          Hide
          Scott Carey added a comment -

          How about avro-tools-VERSION-nodeps.jar, with a maven classifier of 'nodeps'? That should be pretty easy to do with a minor pom change, I'll have a look at it in a few days.

          The thrift-protobuf-compare benchmark requires it, and others call the tools classes directly instead of by command-line.

          Off-topic FYI RE benchmarks:
          There are some changes to the benchmark code and Avro that can speed up the thrift-protobuf-compare stuff. If Utf8 is 'more lazy' and doesn't create a byte array initially when constructed from a String, that helps. Replacing some of the
          for (type thing : things) loops in the harness code with
          for (int counter; condition; increment) loops helps some. The harness overhead is high, time is mostly in String to Utf8 conversions, GenericDatumReader/Writer, and the harness – almost nothing is in the low level serialization/deserialization. Because of the requirement to extract their object from our IndexedRecords, the Reflect API is probably fastest here – we can avoid the extra copy step and produce the exact object type.

          Show
          Scott Carey added a comment - How about avro-tools-VERSION-nodeps.jar, with a maven classifier of 'nodeps'? That should be pretty easy to do with a minor pom change, I'll have a look at it in a few days. The thrift-protobuf-compare benchmark requires it, and others call the tools classes directly instead of by command-line. Off-topic FYI RE benchmarks: There are some changes to the benchmark code and Avro that can speed up the thrift-protobuf-compare stuff. If Utf8 is 'more lazy' and doesn't create a byte array initially when constructed from a String, that helps. Replacing some of the for (type thing : things) loops in the harness code with for (int counter; condition; increment) loops helps some. The harness overhead is high, time is mostly in String to Utf8 conversions, GenericDatumReader/Writer, and the harness – almost nothing is in the low level serialization/deserialization. Because of the requirement to extract their object from our IndexedRecords, the Reflect API is probably fastest here – we can avoid the extra copy step and produce the exact object type.
          Hide
          Doug Cutting added a comment -

          It would be good to fix this in 1.5.1.

          Show
          Doug Cutting added a comment - It would be good to fix this in 1.5.1.
          Hide
          Doug Cutting added a comment -

          This is related to AVRO-758.

          Show
          Doug Cutting added a comment - This is related to AVRO-758 .
          Hide
          Doug Cutting added a comment -

          I don't think we should change the contents or name of avro-tools.jar. It's convenient to have a single jar that works as a command-line executable.

          But I agree that we should add a new jar that contains just the tool implementations. I needed this recently when updating the java-serialization benchmark code to use Avro 1.5.0. Perhaps the new jar could be called avro-tool.jar?

          Show
          Doug Cutting added a comment - I don't think we should change the contents or name of avro-tools.jar. It's convenient to have a single jar that works as a command-line executable. But I agree that we should add a new jar that contains just the tool implementations. I needed this recently when updating the java-serialization benchmark code to use Avro 1.5.0. Perhaps the new jar could be called avro-tool.jar?

            People

            • Assignee:
              Scott Carey
              Reporter:
              Xiaolu Ye
            • Votes:
              1 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development