ORC / ORC-151

Cut down on the size of the tools jar by excluding more

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0
    • Component/s: None
    • Labels: None

      Description

      It would be good to cut down the size of the tools jar by excluding more of the transitive dependencies, especially through the hadoop jars.
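
One way to trim the jar is to add explicit Maven `<exclusions>` to the hadoop-common dependency in the tools module's pom.xml. The artifacts excluded below are illustrative examples of heavy transitive dependencies that hadoop-common pulls in; the actual set excluded for 1.4.0 may differ.

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <exclusions>
    <!-- Illustrative: the servlet/REST stack is not needed for reading or writing ORC files -->
    <exclusion>
      <groupId>com.sun.jersey</groupId>
      <artifactId>jersey-server</artifactId>
    </exclusion>
    <exclusion>
      <groupId>com.sun.jersey</groupId>
      <artifactId>jersey-json</artifactId>
    </exclusion>
    <exclusion>
      <groupId>tomcat</groupId>
      <artifactId>jasper-compiler</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.apache.curator</groupId>
      <artifactId>curator-recipes</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Running `mvn dependency:tree` before and after is a quick way to confirm each exclusion actually removed the subtree.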

        Issue Links

          Activity

          Istvan Szukacs added a comment -

          I was thinking about splitting orc-core into orc-hadoop and orc-standalone, with separate reader and writer implementations in which the path and configuration types are provided differently.

          Hadoop in orc-core:

          ./pom.xml:      <groupId>org.apache.hadoop</groupId>
          ./pom.xml:      <artifactId>hadoop-common</artifactId>
          ./src/java/org/apache/orc/impl/ColumnStatisticsImpl.java:import org.apache.hadoop.io.BytesWritable;
          ./src/java/org/apache/orc/impl/ColumnStatisticsImpl.java:import org.apache.hadoop.io.Text;
          ./src/java/org/apache/orc/impl/ColumnStatisticsImpl.java:import org.apache.hadoop.io.WritableComparator;
          ./src/java/org/apache/orc/impl/DataReaderProperties.java:import org.apache.hadoop.fs.FileSystem;
          ./src/java/org/apache/orc/impl/DataReaderProperties.java:import org.apache.hadoop.fs.Path;
          ./src/java/org/apache/orc/impl/DynamicByteArray.java:import org.apache.hadoop.io.Text;
          ./src/java/org/apache/orc/impl/HadoopShims.java:import org.apache.hadoop.fs.FSDataInputStream;
          ./src/java/org/apache/orc/impl/HadoopShims.java:import org.apache.hadoop.io.Text;
          ./src/java/org/apache/orc/impl/HadoopShims.java:import org.apache.hadoop.util.VersionInfo;
          ./src/java/org/apache/orc/impl/HadoopShims.java:public interface HadoopShims {
          ./src/java/org/apache/orc/impl/HadoopShims.java:   * a hadoop.io ByteBufferPool shim.
          ./src/java/org/apache/orc/impl/HadoopShims.java:    private static HadoopShims SHIMS = null;
          ./src/java/org/apache/orc/impl/HadoopShims.java:    public static synchronized HadoopShims get() {
          ./src/java/org/apache/orc/impl/HadoopShims.java:          SHIMS = new HadoopShims_2_2();
          ./src/java/org/apache/orc/impl/HadoopShims.java:          SHIMS = new HadoopShimsCurrent();
          ./src/java/org/apache/orc/impl/HadoopShims_2_2.java:import org.apache.hadoop.fs.FSDataInputStream;
          ./src/java/org/apache/orc/impl/HadoopShims_2_2.java:import org.apache.hadoop.io.Text;
          ./src/java/org/apache/orc/impl/HadoopShims_2_2.java: * Shims for versions of Hadoop up to and including 2.2.x
          ./src/java/org/apache/orc/impl/HadoopShims_2_2.java:public class HadoopShims_2_2 implements HadoopShims {
          ./src/java/org/apache/orc/impl/HadoopShims_2_2.java:  HadoopShims_2_2() {
          ./src/java/org/apache/orc/impl/HadoopShims_2_2.java:      Class.forName("org.apache.hadoop.fs.CacheFlag", false,
          ./src/java/org/apache/orc/impl/HadoopShims_2_2.java:        HadoopShims_2_2.class.getClassLoader());
          ./src/java/org/apache/orc/impl/HadoopShimsCurrent.java:import org.apache.hadoop.fs.FSDataInputStream;
          ./src/java/org/apache/orc/impl/HadoopShimsCurrent.java:import org.apache.hadoop.io.Text;
          ./src/java/org/apache/orc/impl/HadoopShimsCurrent.java:import org.apache.hadoop.io.compress.snappy.SnappyDecompressor;
          ./src/java/org/apache/orc/impl/HadoopShimsCurrent.java:import org.apache.hadoop.io.compress.zlib.ZlibDecompressor;
          ./src/java/org/apache/orc/impl/HadoopShimsCurrent.java: * Shims for recent versions of Hadoop
          ./src/java/org/apache/orc/impl/HadoopShimsCurrent.java:public class HadoopShimsCurrent implements HadoopShims {
          ./src/java/org/apache/orc/impl/HadoopShimsCurrent.java:    private final org.apache.hadoop.io.compress.DirectDecompressor root;
          ./src/java/org/apache/orc/impl/HadoopShimsCurrent.java:    DirectDecompressWrapper(org.apache.hadoop.io.compress.DirectDecompressor root) {
          ./src/java/org/apache/orc/impl/MemoryManager.java:import org.apache.hadoop.conf.Configuration;
          ./src/java/org/apache/orc/impl/MemoryManagerImpl.java:import org.apache.hadoop.conf.Configuration;
          ./src/java/org/apache/orc/impl/MemoryManagerImpl.java:import org.apache.hadoop.fs.Path;
          ./src/java/org/apache/orc/impl/OrcAcidUtils.java:import org.apache.hadoop.fs.FSDataInputStream;
          ./src/java/org/apache/orc/impl/OrcAcidUtils.java:import org.apache.hadoop.fs.FileSystem;
          ./src/java/org/apache/orc/impl/OrcAcidUtils.java:import org.apache.hadoop.fs.Path;
          ./src/java/org/apache/orc/impl/PhysicalFsWriter.java:import org.apache.hadoop.fs.FSDataOutputStream;
          ./src/java/org/apache/orc/impl/PhysicalFsWriter.java:import org.apache.hadoop.fs.FileSystem;
          ./src/java/org/apache/orc/impl/PhysicalFsWriter.java:import org.apache.hadoop.fs.Path;
          ./src/java/org/apache/orc/impl/ReaderImpl.java:import org.apache.hadoop.fs.FileStatus;
          ./src/java/org/apache/orc/impl/ReaderImpl.java:import org.apache.hadoop.conf.Configuration;
          ./src/java/org/apache/orc/impl/ReaderImpl.java:import org.apache.hadoop.fs.FSDataInputStream;
          ./src/java/org/apache/orc/impl/ReaderImpl.java:import org.apache.hadoop.fs.FileSystem;
          ./src/java/org/apache/orc/impl/ReaderImpl.java:import org.apache.hadoop.fs.Path;
          ./src/java/org/apache/orc/impl/ReaderImpl.java:import org.apache.hadoop.io.Text;
          ./src/java/org/apache/orc/impl/RecordReaderImpl.java:import org.apache.hadoop.fs.Path;
          ./src/java/org/apache/orc/impl/RecordReaderImpl.java:import org.apache.hadoop.io.Text;
          ./src/java/org/apache/orc/impl/RecordReaderUtils.java:import org.apache.hadoop.fs.FSDataInputStream;
          ./src/java/org/apache/orc/impl/RecordReaderUtils.java:import org.apache.hadoop.fs.FileSystem;
          ./src/java/org/apache/orc/impl/RecordReaderUtils.java:import org.apache.hadoop.fs.Path;
          ./src/java/org/apache/orc/impl/RecordReaderUtils.java:  private static final HadoopShims SHIMS = HadoopShims.Factory.get();
          ./src/java/org/apache/orc/impl/RecordReaderUtils.java:    private HadoopShims.ZeroCopyReaderShim zcr = null;
          ./src/java/org/apache/orc/impl/RecordReaderUtils.java:      try (HadoopShims.ZeroCopyReaderShim myZcr = zcr) {
          ./src/java/org/apache/orc/impl/RecordReaderUtils.java:                                      HadoopShims.ZeroCopyReaderShim zcr,
          ./src/java/org/apache/orc/impl/RecordReaderUtils.java:  static HadoopShims.ZeroCopyReaderShim createZeroCopyShim(FSDataInputStream file,
          ./src/java/org/apache/orc/impl/RecordReaderUtils.java:  // this is an implementation copied from ElasticByteBufferPool in hadoop-2,
          ./src/java/org/apache/orc/impl/RecordReaderUtils.java:  public final static class ByteBufferAllocatorPool implements HadoopShims.ByteBufferPoolShim {
          ./src/java/org/apache/orc/impl/SchemaEvolution.java:import org.apache.hadoop.conf.Configuration;
          ./src/java/org/apache/orc/impl/SnappyCodec.java:  private static final HadoopShims SHIMS = HadoopShims.Factory.get();
          ./src/java/org/apache/orc/impl/SnappyCodec.java:            HadoopShims.DirectCompressionType.SNAPPY) != null) {
          ./src/java/org/apache/orc/impl/SnappyCodec.java:    HadoopShims.DirectDecompressor decompressShim =
          ./src/java/org/apache/orc/impl/SnappyCodec.java:        SHIMS.getDirectDecompressor(HadoopShims.DirectCompressionType.SNAPPY);
          ./src/java/org/apache/orc/impl/StringRedBlackTree.java:import org.apache.hadoop.io.Text;
          ./src/java/org/apache/orc/impl/TreeReaderFactory.java:    private static final HadoopShims SHIMS = HadoopShims.Factory.get();
          ./src/java/org/apache/orc/impl/WriterImpl.java:import org.apache.hadoop.conf.Configuration;
          ./src/java/org/apache/orc/impl/WriterImpl.java:import org.apache.hadoop.fs.FileSystem;
          ./src/java/org/apache/orc/impl/WriterImpl.java:import org.apache.hadoop.fs.Path;
          ./src/java/org/apache/orc/impl/WriterImpl.java:import org.apache.hadoop.io.Text;
          ./src/java/org/apache/orc/impl/ZeroCopyShims.java:import org.apache.hadoop.fs.FSDataInputStream;
          ./src/java/org/apache/orc/impl/ZeroCopyShims.java:import org.apache.hadoop.fs.ReadOption;
          ./src/java/org/apache/orc/impl/ZeroCopyShims.java:import org.apache.hadoop.io.ByteBufferPool;
          ./src/java/org/apache/orc/impl/ZeroCopyShims.java:    private HadoopShims.ByteBufferPoolShim pool;
          ./src/java/org/apache/orc/impl/ZeroCopyShims.java:    public ByteBufferPoolAdapter(HadoopShims.ByteBufferPoolShim pool) {
          ./src/java/org/apache/orc/impl/ZeroCopyShims.java:  private static final class ZeroCopyAdapter implements HadoopShims.ZeroCopyReaderShim {
          ./src/java/org/apache/orc/impl/ZeroCopyShims.java:                           HadoopShims.ByteBufferPoolShim poolshim) {
          ./src/java/org/apache/orc/impl/ZeroCopyShims.java:  public static HadoopShims.ZeroCopyReaderShim getZeroCopyReader(FSDataInputStream in,
          ./src/java/org/apache/orc/impl/ZeroCopyShims.java:                                                                 HadoopShims.ByteBufferPoolShim pool) throws IOException {
          ./src/java/org/apache/orc/impl/ZlibCodec.java:  private static final HadoopShims SHIMS = HadoopShims.Factory.get();
          ./src/java/org/apache/orc/impl/ZlibCodec.java:            HadoopShims.DirectCompressionType.ZLIB_NOHEADER) != null) {
          ./src/java/org/apache/orc/impl/ZlibCodec.java:    HadoopShims.DirectDecompressor decompressShim =
          ./src/java/org/apache/orc/impl/ZlibCodec.java:        SHIMS.getDirectDecompressor(HadoopShims.DirectCompressionType.ZLIB_NOHEADER);
          ./src/java/org/apache/orc/MemoryManager.java:import org.apache.hadoop.fs.Path;
          ./src/java/org/apache/orc/OrcConf.java:import org.apache.hadoop.conf.Configuration;
          ./src/java/org/apache/orc/OrcConf.java:      "Use zerocopy reads with ORC. (This requires Hadoop 2.3 or later.)"),
          ./src/java/org/apache/orc/OrcFile.java:import org.apache.hadoop.conf.Configuration;
          ./src/java/org/apache/orc/OrcFile.java:import org.apache.hadoop.fs.FSDataInputStream;
          ./src/java/org/apache/orc/OrcFile.java:import org.apache.hadoop.fs.FileSystem;
          ./src/java/org/apache/orc/OrcFile.java:import org.apache.hadoop.fs.Path;
          ./src/java/org/apache/orc/Reader.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/impl/TestBitPack.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/impl/TestBitPack.java:import org.apache.hadoop.fs.FileSystem;
          ./src/test/org/apache/orc/impl/TestBitPack.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/impl/TestColumnStatisticsImpl.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/impl/TestColumnStatisticsImpl.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/impl/TestDataReaderProperties.java:import org.apache.hadoop.fs.FileSystem;
          ./src/test/org/apache/orc/impl/TestDataReaderProperties.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/impl/TestMemoryManager.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/impl/TestMemoryManager.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/impl/TestReaderImpl.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/impl/TestReaderImpl.java:import org.apache.hadoop.fs.FSDataInputStream;
          ./src/test/org/apache/orc/impl/TestReaderImpl.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/impl/TestReaderImpl.java:import org.apache.hadoop.fs.PositionedReadable;
          ./src/test/org/apache/orc/impl/TestReaderImpl.java:import org.apache.hadoop.fs.Seekable;
          ./src/test/org/apache/orc/impl/TestReaderImpl.java:import org.apache.hadoop.io.Text;
          ./src/test/org/apache/orc/impl/TestRecordReaderImpl.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/impl/TestRecordReaderImpl.java:import org.apache.hadoop.fs.FSDataInputStream;
          ./src/test/org/apache/orc/impl/TestRecordReaderImpl.java:import org.apache.hadoop.fs.FileStatus;
          ./src/test/org/apache/orc/impl/TestRecordReaderImpl.java:import org.apache.hadoop.fs.FileSystem;
          ./src/test/org/apache/orc/impl/TestRecordReaderImpl.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/impl/TestRecordReaderImpl.java:import org.apache.hadoop.fs.PositionedReadable;
          ./src/test/org/apache/orc/impl/TestRecordReaderImpl.java:import org.apache.hadoop.fs.Seekable;
          ./src/test/org/apache/orc/impl/TestRecordReaderImpl.java:import org.apache.hadoop.io.DataOutputBuffer;
          ./src/test/org/apache/orc/impl/TestSchemaEvolution.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/impl/TestSchemaEvolution.java:import org.apache.hadoop.fs.FileSystem;
          ./src/test/org/apache/orc/impl/TestSchemaEvolution.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/impl/TestStringRedBlackTree.java:import org.apache.hadoop.io.DataOutputBuffer;
          ./src/test/org/apache/orc/impl/TestStringRedBlackTree.java:import org.apache.hadoop.io.IntWritable;
          ./src/test/org/apache/orc/TestColumnStatistics.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/TestColumnStatistics.java:import org.apache.hadoop.fs.FileSystem;
          ./src/test/org/apache/orc/TestColumnStatistics.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/TestColumnStatistics.java:import org.apache.hadoop.io.BytesWritable;
          ./src/test/org/apache/orc/TestColumnStatistics.java:import org.apache.hadoop.io.Text;
          ./src/test/org/apache/orc/TestNewIntegerEncoding.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/TestNewIntegerEncoding.java:import org.apache.hadoop.fs.FileSystem;
          ./src/test/org/apache/orc/TestNewIntegerEncoding.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/TestOrcNullOptimization.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/TestOrcNullOptimization.java:import org.apache.hadoop.fs.FileSystem;
          ./src/test/org/apache/orc/TestOrcNullOptimization.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/TestOrcTimezone1.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/TestOrcTimezone1.java:import org.apache.hadoop.fs.FileSystem;
          ./src/test/org/apache/orc/TestOrcTimezone1.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/TestOrcTimezone2.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/TestOrcTimezone2.java:import org.apache.hadoop.fs.FileSystem;
          ./src/test/org/apache/orc/TestOrcTimezone2.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/TestOrcTimezone3.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/TestOrcTimezone3.java:import org.apache.hadoop.fs.FileSystem;
          ./src/test/org/apache/orc/TestOrcTimezone3.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/TestOrcTimezonePPD.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/TestOrcTimezonePPD.java:import org.apache.hadoop.fs.FileSystem;
          ./src/test/org/apache/orc/TestOrcTimezonePPD.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/TestReader.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/TestReader.java:import org.apache.hadoop.fs.FSDataOutputStream;
          ./src/test/org/apache/orc/TestReader.java:import org.apache.hadoop.fs.FileSystem;
          ./src/test/org/apache/orc/TestReader.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/TestStringDictionary.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/TestStringDictionary.java:import org.apache.hadoop.fs.FileSystem;
          ./src/test/org/apache/orc/TestStringDictionary.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/TestUnrolledBitPack.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/TestUnrolledBitPack.java:import org.apache.hadoop.fs.FileSystem;
          ./src/test/org/apache/orc/TestUnrolledBitPack.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/TestVectorOrcFile.java:import org.apache.hadoop.conf.Configuration;
          ./src/test/org/apache/orc/TestVectorOrcFile.java:import org.apache.hadoop.fs.FileSystem;
          ./src/test/org/apache/orc/TestVectorOrcFile.java:import org.apache.hadoop.fs.Path;
          ./src/test/org/apache/orc/TestVectorOrcFile.java:import org.apache.hadoop.io.BytesWritable;
          ./src/test/org/apache/orc/TestVectorOrcFile.java:import org.apache.hadoop.io.Text;
          ./src/test/resources/log4j.properties:log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR
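
The split proposed above could put the file-system surface behind a small interface, so that orc-standalone never touches org.apache.hadoop.fs while orc-hadoop delegates to FileSystem/Path. The sketch below is hypothetical; OrcDataSource and InMemoryDataSource are illustrative names, not part of the ORC codebase.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction: orc-standalone could implement this with
// java.nio, while orc-hadoop would delegate to org.apache.hadoop.fs.FileSystem.
interface OrcDataSource {
    InputStream open(String path) throws IOException;
    OutputStream create(String path) throws IOException;
}

// Minimal in-memory implementation, standing in for the "standalone" side.
class InMemoryDataSource implements OrcDataSource {
    private final Map<String, byte[]> files = new HashMap<>();

    @Override
    public InputStream open(String path) throws IOException {
        byte[] data = files.get(path);
        if (data == null) throw new IOException("No such file: " + path);
        return new ByteArrayInputStream(data);
    }

    @Override
    public OutputStream create(String path) {
        // Buffer writes in memory; publish the bytes on close().
        return new ByteArrayOutputStream() {
            @Override
            public void close() throws IOException {
                super.close();
                files.put(path, toByteArray());
            }
        };
    }
}

public class Sketch {
    public static void main(String[] args) throws IOException {
        OrcDataSource fs = new InMemoryDataSource();
        try (OutputStream out = fs.create("/tmp/demo.orc")) {
            out.write(new byte[]{'O', 'R', 'C'});
        }
        try (InputStream in = fs.open("/tmp/demo.orc")) {
            System.out.println((char) in.read()); // prints 'O'
        }
    }
}
```

The grep output above shows the dependency is concentrated in a handful of types (Path, Configuration, FileSystem, the io.Writable family), which is what makes this kind of seam plausible.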
          

          Dependency (hell) of Hadoop:

           
          [INFO] |  +- org.apache.hadoop:hadoop-common:jar:2.6.4:compile
          [INFO] |  |  +- org.apache.hadoop:hadoop-annotations:jar:2.6.4:compile
          [INFO] |  |  +- com.google.guava:guava:jar:11.0.2:compile
          [INFO] |  |  +- commons-cli:commons-cli:jar:1.2:compile
          [INFO] |  |  +- org.apache.commons:commons-math3:jar:3.1.1:compile
          [INFO] |  |  +- xmlenc:xmlenc:jar:0.52:compile
          [INFO] |  |  +- commons-httpclient:commons-httpclient:jar:3.1:compile
          [INFO] |  |  |  +- (commons-logging:commons-logging:jar:1.0.4:compile - omitted for conflict with 1.1.3)
          [INFO] |  |  |  \- (commons-codec:commons-codec:jar:1.2:compile - omitted for conflict with 1.4)
          [INFO] |  |  +- commons-codec:commons-codec:jar:1.4:compile
          [INFO] |  |  +- commons-io:commons-io:jar:2.4:compile
          [INFO] |  |  +- commons-net:commons-net:jar:3.1:compile
          [INFO] |  |  +- commons-collections:commons-collections:jar:3.2.2:compile
          [INFO] |  |  +- com.sun.jersey:jersey-core:jar:1.9:compile
          [INFO] |  |  +- com.sun.jersey:jersey-json:jar:1.9:compile
          [INFO] |  |  |  +- org.codehaus.jettison:jettison:jar:1.1:compile
          [INFO] |  |  |  +- com.sun.xml.bind:jaxb-impl:jar:2.2.3-1:compile
          [INFO] |  |  |  |  \- javax.xml.bind:jaxb-api:jar:2.2.2:compile
          [INFO] |  |  |  |     +- javax.xml.stream:stax-api:jar:1.0-2:compile
          [INFO] |  |  |  |     \- javax.activation:activation:jar:1.1:compile
          [INFO] |  |  |  +- (org.codehaus.jackson:jackson-core-asl:jar:1.8.3:compile - omitted for conflict with 1.9.13)
          [INFO] |  |  |  +- (org.codehaus.jackson:jackson-mapper-asl:jar:1.8.3:compile - omitted for conflict with 1.9.13)
          [INFO] |  |  |  +- org.codehaus.jackson:jackson-jaxrs:jar:1.8.3:compile
          [INFO] |  |  |  |  +- (org.codehaus.jackson:jackson-core-asl:jar:1.8.3:compile - omitted for conflict with 1.9.13)
          [INFO] |  |  |  |  \- (org.codehaus.jackson:jackson-mapper-asl:jar:1.8.3:compile - omitted for conflict with 1.9.13)
          [INFO] |  |  |  +- org.codehaus.jackson:jackson-xc:jar:1.8.3:compile
          [INFO] |  |  |  |  +- (org.codehaus.jackson:jackson-core-asl:jar:1.8.3:compile - omitted for conflict with 1.9.13)
          [INFO] |  |  |  |  \- (org.codehaus.jackson:jackson-mapper-asl:jar:1.8.3:compile - omitted for conflict with 1.9.13)
          [INFO] |  |  |  \- (com.sun.jersey:jersey-core:jar:1.9:compile - omitted for duplicate)
          [INFO] |  |  +- com.sun.jersey:jersey-server:jar:1.9:compile
          [INFO] |  |  |  +- asm:asm:jar:3.1:compile
          [INFO] |  |  |  \- (com.sun.jersey:jersey-core:jar:1.9:compile - omitted for duplicate)
          [INFO] |  |  +- tomcat:jasper-compiler:jar:5.5.23:runtime
          [INFO] |  |  +- tomcat:jasper-runtime:jar:5.5.23:runtime
          [INFO] |  |  |  \- (commons-el:commons-el:jar:1.0:runtime - omitted for duplicate)
          [INFO] |  |  +- commons-el:commons-el:jar:1.0:runtime
          [INFO] |  |  |  \- (commons-logging:commons-logging:jar:1.0.3:runtime - omitted for conflict with 1.0.4)
          [INFO] |  |  +- commons-logging:commons-logging:jar:1.1.3:compile
          [INFO] |  |  +- (log4j:log4j:jar:1.2.17:compile - omitted for duplicate)
          [INFO] |  |  +- net.java.dev.jets3t:jets3t:jar:0.9.0:compile
          [INFO] |  |  |  +- (commons-codec:commons-codec:jar:1.4:compile - omitted for duplicate)
          [INFO] |  |  |  +- (commons-logging:commons-logging:jar:1.1.1:compile - omitted for conflict with 1.1.3)
          [INFO] |  |  |  +- org.apache.httpcomponents:httpclient:jar:4.1.2:compile
          [INFO] |  |  |  |  \- (org.apache.httpcomponents:httpcore:jar:4.1.2:compile - omitted for duplicate)
          [INFO] |  |  |  +- org.apache.httpcomponents:httpcore:jar:4.1.2:compile
          [INFO] |  |  |  \- com.jamesmurty.utils:java-xmlbuilder:jar:0.4:compile
          [INFO] |  |  +- (commons-lang:commons-lang:jar:2.6:compile - omitted for duplicate)
          [INFO] |  |  +- commons-configuration:commons-configuration:jar:1.6:compile
          [INFO] |  |  |  +- (commons-collections:commons-collections:jar:3.2.1:compile - omitted for conflict with 3.2.2)
          [INFO] |  |  |  +- (commons-lang:commons-lang:jar:2.4:compile - omitted for conflict with 2.6)
          [INFO] |  |  |  +- (commons-logging:commons-logging:jar:1.1.1:compile - omitted for conflict with 1.1.3)
          [INFO] |  |  |  +- commons-digester:commons-digester:jar:1.8:compile
          [INFO] |  |  |  |  +- commons-beanutils:commons-beanutils:jar:1.7.0:compile
          [INFO] |  |  |  |  |  \- (commons-logging:commons-logging:jar:1.0.3:compile - omitted for conflict with 1.1.3)
          [INFO] |  |  |  |  \- (commons-logging:commons-logging:jar:1.1:compile - omitted for conflict with 1.1.3)
          [INFO] |  |  |  \- commons-beanutils:commons-beanutils-core:jar:1.8.0:compile
          [INFO] |  |  |     \- (commons-logging:commons-logging:jar:1.1.1:compile - omitted for conflict with 1.1.3)
          [INFO] |  |  +- (org.slf4j:slf4j-api:jar:1.7.5:compile - omitted for conflict with 1.7.7)
          [INFO] |  |  +- (org.slf4j:slf4j-log4j12:jar:1.7.5:compile - scope updated from runtime; omitted for duplicate)
          [INFO] |  |  +- (org.codehaus.jackson:jackson-core-asl:jar:1.9.13:compile - omitted for duplicate)
          [INFO] |  |  +- (org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:compile - omitted for duplicate)
          [INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:2.5.0:compile - omitted for duplicate)
          [INFO] |  |  +- com.google.code.gson:gson:jar:2.2.4:compile
          [INFO] |  |  +- org.apache.hadoop:hadoop-auth:jar:2.6.4:compile
          [INFO] |  |  |  +- (org.slf4j:slf4j-api:jar:1.7.5:compile - omitted for conflict with 1.7.7)
          [INFO] |  |  |  +- (commons-codec:commons-codec:jar:1.4:compile - omitted for duplicate)
          [INFO] |  |  |  +- (log4j:log4j:jar:1.2.17:runtime - omitted for duplicate)
          [INFO] |  |  |  +- (org.slf4j:slf4j-log4j12:jar:1.7.5:runtime - omitted for duplicate)
          [INFO] |  |  |  +- (org.apache.httpcomponents:httpclient:jar:4.2.5:compile - omitted for conflict with 4.1.2)
          [INFO] |  |  |  +- org.apache.directory.server:apacheds-kerberos-codec:jar:2.0.0-M15:compile
          [INFO] |  |  |  |  +- org.apache.directory.server:apacheds-i18n:jar:2.0.0-M15:compile
          [INFO] |  |  |  |  |  \- (org.slf4j:slf4j-api:jar:1.7.5:compile - omitted for conflict with 1.7.7)
          [INFO] |  |  |  |  +- org.apache.directory.api:api-asn1-api:jar:1.0.0-M20:compile
          [INFO] |  |  |  |  |  \- (org.slf4j:slf4j-api:jar:1.7.5:compile - omitted for conflict with 1.7.7)
          [INFO] |  |  |  |  +- org.apache.directory.api:api-util:jar:1.0.0-M20:compile
          [INFO] |  |  |  |  |  \- (org.slf4j:slf4j-api:jar:1.7.5:compile - omitted for conflict with 1.7.7)
          [INFO] |  |  |  |  \- (org.slf4j:slf4j-api:jar:1.7.5:compile - omitted for conflict with 1.7.7)
          [INFO] |  |  |  +- (org.apache.zookeeper:zookeeper:jar:3.4.6:compile - omitted for duplicate)
          [INFO] |  |  |  \- org.apache.curator:curator-framework:jar:2.6.0:compile
          [INFO] |  |  |     +- (org.apache.curator:curator-client:jar:2.6.0:compile - omitted for duplicate)
          [INFO] |  |  |     +- (org.apache.zookeeper:zookeeper:jar:3.4.6:compile - omitted for duplicate)
          [INFO] |  |  |     \- (com.google.guava:guava:jar:16.0.1:compile - omitted for conflict with 11.0.2)
          [INFO] |  |  +- com.jcraft:jsch:jar:0.1.42:compile
          [INFO] |  |  +- org.apache.curator:curator-client:jar:2.6.0:compile
          [INFO] |  |  |  +- (org.slf4j:slf4j-api:jar:1.7.6:compile - omitted for conflict with 1.7.7)
          [INFO] |  |  |  +- (org.apache.zookeeper:zookeeper:jar:3.4.6:compile - omitted for duplicate)
          [INFO] |  |  |  \- (com.google.guava:guava:jar:16.0.1:compile - omitted for conflict with 11.0.2)
          [INFO] |  |  +- org.apache.curator:curator-recipes:jar:2.6.0:compile
          [INFO] |  |  |  +- (org.apache.curator:curator-framework:jar:2.6.0:compile - omitted for duplicate)
          [INFO] |  |  |  +- (org.apache.zookeeper:zookeeper:jar:3.4.6:compile - omitted for duplicate)
          [INFO] |  |  |  \- (com.google.guava:guava:jar:16.0.1:compile - omitted for conflict with 11.0.2)
          [INFO] |  |  +- org.htrace:htrace-core:jar:3.0.4:compile
          [INFO] |  |  |  +- (com.google.guava:guava:jar:12.0.1:compile - omitted for conflict with 11.0.2)
          [INFO] |  |  |  \- (commons-logging:commons-logging:jar:1.1.1:compile - omitted for conflict with 1.1.3)
          [INFO] |  |  +- org.apache.zookeeper:zookeeper:jar:3.4.6:compile
          [INFO] |  |  |  +- (org.slf4j:slf4j-api:jar:1.6.1:compile - omitted for conflict with 1.7.7)
          [INFO] |  |  |  +- org.slf4j:slf4j-log4j12:jar:1.6.1:compile
          [INFO] |  |  |  |  +- (org.slf4j:slf4j-api:jar:1.6.1:compile - omitted for conflict with 1.7.7)
          [INFO] |  |  |  |  \- (log4j:log4j:jar:1.2.16:compile - omitted for conflict with 1.2.17)
          [INFO] |  |  |  +- (log4j:log4j:jar:1.2.16:compile - omitted for conflict with 1.2.17)
          [INFO] |  |  |  \- io.netty:netty:jar:3.7.0.Final:compile
          [INFO] |  |  \- (org.apache.commons:commons-compress:jar:1.4.1:compile - omitted for conflict with 1.8.1)
          [INFO] |  +- org.apache.hive:hive-storage-api:jar:2.2.0:compile
          [INFO] |  |  +- (commons-lang:commons-lang:jar:2.6:compile - omitted for duplicate)
          [INFO] |  |  \- (org.slf4j:slf4j-api:jar:1.7.10:compile - omitted for conflict with 1.7.7)
          [INFO] |  \- (org.slf4j:slf4j-api:jar:1.7.5:compile - omitted for conflict with 1.7.7)
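
An alternative to hand-maintained exclusion lists is to let the maven-shade-plugin drop classes the tools jar never references. This is a hedged sketch, not necessarily what the 1.4.0 fix did:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <!-- Remove classes unreachable from this module's own classes -->
    <minimizeJar>true</minimizeJar>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
    </execution>
  </executions>
</plugin>
```

One caveat: minimizeJar cannot see classes loaded only reflectively (as HadoopShims does via Class.forName), so explicit filters are usually needed alongside it.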
          
(This requires Hadoop 2.3 or later.)" ), ./src/java/org/apache/orc/OrcFile.java: import org.apache.hadoop.conf.Configuration; ./src/java/org/apache/orc/OrcFile.java: import org.apache.hadoop.fs.FSDataInputStream; ./src/java/org/apache/orc/OrcFile.java: import org.apache.hadoop.fs.FileSystem; ./src/java/org/apache/orc/OrcFile.java: import org.apache.hadoop.fs.Path; ./src/java/org/apache/orc/Reader.java: import org.apache.hadoop.conf.Configuration; ./src/test/org/apache/orc/impl/TestBitPack.java: import org.apache.hadoop.conf.Configuration; ./src/test/org/apache/orc/impl/TestBitPack.java: import org.apache.hadoop.fs.FileSystem; ./src/test/org/apache/orc/impl/TestBitPack.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/impl/TestColumnStatisticsImpl.java: import org.apache.hadoop.conf.Configuration; ./src/test/org/apache/orc/impl/TestColumnStatisticsImpl.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/impl/TestDataReaderProperties.java: import org.apache.hadoop.fs.FileSystem; ./src/test/org/apache/orc/impl/TestDataReaderProperties.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/impl/TestMemoryManager.java: import org.apache.hadoop.conf.Configuration; ./src/test/org/apache/orc/impl/TestMemoryManager.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/impl/TestReaderImpl.java: import org.apache.hadoop.conf.Configuration; ./src/test/org/apache/orc/impl/TestReaderImpl.java: import org.apache.hadoop.fs.FSDataInputStream; ./src/test/org/apache/orc/impl/TestReaderImpl.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/impl/TestReaderImpl.java: import org.apache.hadoop.fs.PositionedReadable; ./src/test/org/apache/orc/impl/TestReaderImpl.java: import org.apache.hadoop.fs.Seekable; ./src/test/org/apache/orc/impl/TestReaderImpl.java: import org.apache.hadoop.io.Text; ./src/test/org/apache/orc/impl/TestRecordReaderImpl.java: import org.apache.hadoop.conf.Configuration; 
./src/test/org/apache/orc/impl/TestRecordReaderImpl.java: import org.apache.hadoop.fs.FSDataInputStream; ./src/test/org/apache/orc/impl/TestRecordReaderImpl.java: import org.apache.hadoop.fs.FileStatus; ./src/test/org/apache/orc/impl/TestRecordReaderImpl.java: import org.apache.hadoop.fs.FileSystem; ./src/test/org/apache/orc/impl/TestRecordReaderImpl.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/impl/TestRecordReaderImpl.java: import org.apache.hadoop.fs.PositionedReadable; ./src/test/org/apache/orc/impl/TestRecordReaderImpl.java: import org.apache.hadoop.fs.Seekable; ./src/test/org/apache/orc/impl/TestRecordReaderImpl.java: import org.apache.hadoop.io.DataOutputBuffer; ./src/test/org/apache/orc/impl/TestSchemaEvolution.java: import org.apache.hadoop.conf.Configuration; ./src/test/org/apache/orc/impl/TestSchemaEvolution.java: import org.apache.hadoop.fs.FileSystem; ./src/test/org/apache/orc/impl/TestSchemaEvolution.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/impl/TestStringRedBlackTree.java: import org.apache.hadoop.io.DataOutputBuffer; ./src/test/org/apache/orc/impl/TestStringRedBlackTree.java: import org.apache.hadoop.io.IntWritable; ./src/test/org/apache/orc/TestColumnStatistics.java: import org.apache.hadoop.conf.Configuration; ./src/test/org/apache/orc/TestColumnStatistics.java: import org.apache.hadoop.fs.FileSystem; ./src/test/org/apache/orc/TestColumnStatistics.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/TestColumnStatistics.java: import org.apache.hadoop.io.BytesWritable; ./src/test/org/apache/orc/TestColumnStatistics.java: import org.apache.hadoop.io.Text; ./src/test/org/apache/orc/TestNewIntegerEncoding.java: import org.apache.hadoop.conf.Configuration; ./src/test/org/apache/orc/TestNewIntegerEncoding.java: import org.apache.hadoop.fs.FileSystem; ./src/test/org/apache/orc/TestNewIntegerEncoding.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/TestOrcNullOptimization.java: 
import org.apache.hadoop.conf.Configuration; ./src/test/org/apache/orc/TestOrcNullOptimization.java: import org.apache.hadoop.fs.FileSystem; ./src/test/org/apache/orc/TestOrcNullOptimization.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/TestOrcTimezone1.java: import org.apache.hadoop.conf.Configuration; ./src/test/org/apache/orc/TestOrcTimezone1.java: import org.apache.hadoop.fs.FileSystem; ./src/test/org/apache/orc/TestOrcTimezone1.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/TestOrcTimezone2.java: import org.apache.hadoop.conf.Configuration; ./src/test/org/apache/orc/TestOrcTimezone2.java: import org.apache.hadoop.fs.FileSystem; ./src/test/org/apache/orc/TestOrcTimezone2.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/TestOrcTimezone3.java: import org.apache.hadoop.conf.Configuration; ./src/test/org/apache/orc/TestOrcTimezone3.java: import org.apache.hadoop.fs.FileSystem; ./src/test/org/apache/orc/TestOrcTimezone3.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/TestOrcTimezonePPD.java: import org.apache.hadoop.conf.Configuration; ./src/test/org/apache/orc/TestOrcTimezonePPD.java: import org.apache.hadoop.fs.FileSystem; ./src/test/org/apache/orc/TestOrcTimezonePPD.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/TestReader.java: import org.apache.hadoop.conf.Configuration; ./src/test/org/apache/orc/TestReader.java: import org.apache.hadoop.fs.FSDataOutputStream; ./src/test/org/apache/orc/TestReader.java: import org.apache.hadoop.fs.FileSystem; ./src/test/org/apache/orc/TestReader.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/TestStringDictionary.java: import org.apache.hadoop.conf.Configuration; ./src/test/org/apache/orc/TestStringDictionary.java: import org.apache.hadoop.fs.FileSystem; ./src/test/org/apache/orc/TestStringDictionary.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/TestUnrolledBitPack.java: import 
org.apache.hadoop.conf.Configuration; ./src/test/org/apache/orc/TestUnrolledBitPack.java: import org.apache.hadoop.fs.FileSystem; ./src/test/org/apache/orc/TestUnrolledBitPack.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/TestVectorOrcFile.java: import org.apache.hadoop.conf.Configuration; ./src/test/org/apache/orc/TestVectorOrcFile.java: import org.apache.hadoop.fs.FileSystem; ./src/test/org/apache/orc/TestVectorOrcFile.java: import org.apache.hadoop.fs.Path; ./src/test/org/apache/orc/TestVectorOrcFile.java: import org.apache.hadoop.io.BytesWritable; ./src/test/org/apache/orc/TestVectorOrcFile.java: import org.apache.hadoop.io.Text; ./src/test/resources/log4j.properties:log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR Dependency (hell) of Hadoop: [INFO] | +- org.apache.hadoop:hadoop-common:jar:2.6.4:compile [INFO] | | +- org.apache.hadoop:hadoop-annotations:jar:2.6.4:compile [INFO] | | +- com.google.guava:guava:jar:11.0.2:compile [INFO] | | +- commons-cli:commons-cli:jar:1.2:compile [INFO] | | +- org.apache.commons:commons-math3:jar:3.1.1:compile [INFO] | | +- xmlenc:xmlenc:jar:0.52:compile [INFO] | | +- commons-httpclient:commons-httpclient:jar:3.1:compile [INFO] | | | +- (commons-logging:commons-logging:jar:1.0.4:compile - omitted for conflict with 1.1.3) [INFO] | | | \- (commons-codec:commons-codec:jar:1.2:compile - omitted for conflict with 1.4) [INFO] | | +- commons-codec:commons-codec:jar:1.4:compile [INFO] | | +- commons-io:commons-io:jar:2.4:compile [INFO] | | +- commons-net:commons-net:jar:3.1:compile [INFO] | | +- commons-collections:commons-collections:jar:3.2.2:compile [INFO] | | +- com.sun.jersey:jersey-core:jar:1.9:compile [INFO] | | +- com.sun.jersey:jersey-json:jar:1.9:compile [INFO] | | | +- org.codehaus.jettison:jettison:jar:1.1:compile [INFO] | | | +- com.sun.xml.bind:jaxb-impl:jar:2.2.3-1:compile [INFO] | | | | \- javax.xml.bind:jaxb-api:jar:2.2.2:compile [INFO] | | | | +- 
javax.xml.stream:stax-api:jar:1.0-2:compile [INFO] | | | | \- javax.activation:activation:jar:1.1:compile [INFO] | | | +- (org.codehaus.jackson:jackson-core-asl:jar:1.8.3:compile - omitted for conflict with 1.9.13) [INFO] | | | +- (org.codehaus.jackson:jackson-mapper-asl:jar:1.8.3:compile - omitted for conflict with 1.9.13) [INFO] | | | +- org.codehaus.jackson:jackson-jaxrs:jar:1.8.3:compile [INFO] | | | | +- (org.codehaus.jackson:jackson-core-asl:jar:1.8.3:compile - omitted for conflict with 1.9.13) [INFO] | | | | \- (org.codehaus.jackson:jackson-mapper-asl:jar:1.8.3:compile - omitted for conflict with 1.9.13) [INFO] | | | +- org.codehaus.jackson:jackson-xc:jar:1.8.3:compile [INFO] | | | | +- (org.codehaus.jackson:jackson-core-asl:jar:1.8.3:compile - omitted for conflict with 1.9.13) [INFO] | | | | \- (org.codehaus.jackson:jackson-mapper-asl:jar:1.8.3:compile - omitted for conflict with 1.9.13) [INFO] | | | \- (com.sun.jersey:jersey-core:jar:1.9:compile - omitted for duplicate) [INFO] | | +- com.sun.jersey:jersey-server:jar:1.9:compile [INFO] | | | +- asm:asm:jar:3.1:compile [INFO] | | | \- (com.sun.jersey:jersey-core:jar:1.9:compile - omitted for duplicate) [INFO] | | +- tomcat:jasper-compiler:jar:5.5.23:runtime [INFO] | | +- tomcat:jasper-runtime:jar:5.5.23:runtime [INFO] | | | \- (commons-el:commons-el:jar:1.0:runtime - omitted for duplicate) [INFO] | | +- commons-el:commons-el:jar:1.0:runtime [INFO] | | | \- (commons-logging:commons-logging:jar:1.0.3:runtime - omitted for conflict with 1.0.4) [INFO] | | +- commons-logging:commons-logging:jar:1.1.3:compile [INFO] | | +- (log4j:log4j:jar:1.2.17:compile - omitted for duplicate) [INFO] | | +- net.java.dev.jets3t:jets3t:jar:0.9.0:compile [INFO] | | | +- (commons-codec:commons-codec:jar:1.4:compile - omitted for duplicate) [INFO] | | | +- (commons-logging:commons-logging:jar:1.1.1:compile - omitted for conflict with 1.1.3) [INFO] | | | +- org.apache.httpcomponents:httpclient:jar:4.1.2:compile [INFO] | | | | \- 
(org.apache.httpcomponents:httpcore:jar:4.1.2:compile - omitted for duplicate) [INFO] | | | +- org.apache.httpcomponents:httpcore:jar:4.1.2:compile [INFO] | | | \- com.jamesmurty.utils:java-xmlbuilder:jar:0.4:compile [INFO] | | +- (commons-lang:commons-lang:jar:2.6:compile - omitted for duplicate) [INFO] | | +- commons-configuration:commons-configuration:jar:1.6:compile [INFO] | | | +- (commons-collections:commons-collections:jar:3.2.1:compile - omitted for conflict with 3.2.2) [INFO] | | | +- (commons-lang:commons-lang:jar:2.4:compile - omitted for conflict with 2.6) [INFO] | | | +- (commons-logging:commons-logging:jar:1.1.1:compile - omitted for conflict with 1.1.3) [INFO] | | | +- commons-digester:commons-digester:jar:1.8:compile [INFO] | | | | +- commons-beanutils:commons-beanutils:jar:1.7.0:compile [INFO] | | | | | \- (commons-logging:commons-logging:jar:1.0.3:compile - omitted for conflict with 1.1.3) [INFO] | | | | \- (commons-logging:commons-logging:jar:1.1:compile - omitted for conflict with 1.1.3) [INFO] | | | \- commons-beanutils:commons-beanutils-core:jar:1.8.0:compile [INFO] | | | \- (commons-logging:commons-logging:jar:1.1.1:compile - omitted for conflict with 1.1.3) [INFO] | | +- (org.slf4j:slf4j-api:jar:1.7.5:compile - omitted for conflict with 1.7.7) [INFO] | | +- (org.slf4j:slf4j-log4j12:jar:1.7.5:compile - scope updated from runtime; omitted for duplicate) [INFO] | | +- (org.codehaus.jackson:jackson-core-asl:jar:1.9.13:compile - omitted for duplicate) [INFO] | | +- (org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:compile - omitted for duplicate) [INFO] | | +- (com.google.protobuf:protobuf-java:jar:2.5.0:compile - omitted for duplicate) [INFO] | | +- com.google.code.gson:gson:jar:2.2.4:compile [INFO] | | +- org.apache.hadoop:hadoop-auth:jar:2.6.4:compile [INFO] | | | +- (org.slf4j:slf4j-api:jar:1.7.5:compile - omitted for conflict with 1.7.7) [INFO] | | | +- (commons-codec:commons-codec:jar:1.4:compile - omitted for duplicate) [INFO] | | | +- 
(log4j:log4j:jar:1.2.17:runtime - omitted for duplicate) [INFO] | | | +- (org.slf4j:slf4j-log4j12:jar:1.7.5:runtime - omitted for duplicate) [INFO] | | | +- (org.apache.httpcomponents:httpclient:jar:4.2.5:compile - omitted for conflict with 4.1.2) [INFO] | | | +- org.apache.directory.server:apacheds-kerberos-codec:jar:2.0.0-M15:compile [INFO] | | | | +- org.apache.directory.server:apacheds-i18n:jar:2.0.0-M15:compile [INFO] | | | | | \- (org.slf4j:slf4j-api:jar:1.7.5:compile - omitted for conflict with 1.7.7) [INFO] | | | | +- org.apache.directory.api:api-asn1-api:jar:1.0.0-M20:compile [INFO] | | | | | \- (org.slf4j:slf4j-api:jar:1.7.5:compile - omitted for conflict with 1.7.7) [INFO] | | | | +- org.apache.directory.api:api-util:jar:1.0.0-M20:compile [INFO] | | | | | \- (org.slf4j:slf4j-api:jar:1.7.5:compile - omitted for conflict with 1.7.7) [INFO] | | | | \- (org.slf4j:slf4j-api:jar:1.7.5:compile - omitted for conflict with 1.7.7) [INFO] | | | +- (org.apache.zookeeper:zookeeper:jar:3.4.6:compile - omitted for duplicate) [INFO] | | | \- org.apache.curator:curator-framework:jar:2.6.0:compile [INFO] | | | +- (org.apache.curator:curator-client:jar:2.6.0:compile - omitted for duplicate) [INFO] | | | +- (org.apache.zookeeper:zookeeper:jar:3.4.6:compile - omitted for duplicate) [INFO] | | | \- (com.google.guava:guava:jar:16.0.1:compile - omitted for conflict with 11.0.2) [INFO] | | +- com.jcraft:jsch:jar:0.1.42:compile [INFO] | | +- org.apache.curator:curator-client:jar:2.6.0:compile [INFO] | | | +- (org.slf4j:slf4j-api:jar:1.7.6:compile - omitted for conflict with 1.7.7) [INFO] | | | +- (org.apache.zookeeper:zookeeper:jar:3.4.6:compile - omitted for duplicate) [INFO] | | | \- (com.google.guava:guava:jar:16.0.1:compile - omitted for conflict with 11.0.2) [INFO] | | +- org.apache.curator:curator-recipes:jar:2.6.0:compile [INFO] | | | +- (org.apache.curator:curator-framework:jar:2.6.0:compile - omitted for duplicate) [INFO] | | | +- 
(org.apache.zookeeper:zookeeper:jar:3.4.6:compile - omitted for duplicate) [INFO] | | | \- (com.google.guava:guava:jar:16.0.1:compile - omitted for conflict with 11.0.2) [INFO] | | +- org.htrace:htrace-core:jar:3.0.4:compile [INFO] | | | +- (com.google.guava:guava:jar:12.0.1:compile - omitted for conflict with 11.0.2) [INFO] | | | \- (commons-logging:commons-logging:jar:1.1.1:compile - omitted for conflict with 1.1.3) [INFO] | | +- org.apache.zookeeper:zookeeper:jar:3.4.6:compile [INFO] | | | +- (org.slf4j:slf4j-api:jar:1.6.1:compile - omitted for conflict with 1.7.7) [INFO] | | | +- org.slf4j:slf4j-log4j12:jar:1.6.1:compile [INFO] | | | | +- (org.slf4j:slf4j-api:jar:1.6.1:compile - omitted for conflict with 1.7.7) [INFO] | | | | \- (log4j:log4j:jar:1.2.16:compile - omitted for conflict with 1.2.17) [INFO] | | | +- (log4j:log4j:jar:1.2.16:compile - omitted for conflict with 1.2.17) [INFO] | | | \- io.netty:netty:jar:3.7.0.Final:compile [INFO] | | \- (org.apache.commons:commons-compress:jar:1.4.1:compile - omitted for conflict with 1.8.1) [INFO] | +- org.apache.hive:hive-storage-api:jar:2.2.0:compile [INFO] | | +- (commons-lang:commons-lang:jar:2.6:compile - omitted for duplicate) [INFO] | | \- (org.slf4j:slf4j-api:jar:1.7.10:compile - omitted for conflict with 1.7.7) [INFO] | \- (org.slf4j:slf4j-api:jar:1.7.5:compile - omitted for conflict with 1.7.7)
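A verbose tree like the one above can be regenerated with the standard Maven dependency plugin; the `-Dverbose` flag keeps the conflict/duplicate annotations that plain `dependency:tree` omits. (The `-pl tools` module selector is an assumption about the build layout, not part of the paste above.)

```shell
# Print the full dependency tree, including conflict/duplicate notes,
# for one module of a multi-module Maven build.
mvn -pl tools dependency:tree -Dverbose
```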
          owen.omalley Owen O'Malley added a comment -

          Ok, I modified my dependency graph generator to find the jars that we aren't using at all:

          com.sun.jersey:jersey-json:jar:1.9:compile used: 0, used duplicate: 0, unused: 92
          com.sun.xml.bind:jaxb-impl:jar:2.2.3-1:compile used: 0, used duplicate: 0, unused: 660
          javax.xml.bind:jaxb-api:jar:2.2.2:compile used: 0, used duplicate: 0, unused: 101
          javax.xml.stream:stax-api:jar:1.0-2:compile used: 0, used duplicate: 0, unused: 37
          javax.activation:activation:jar:1.1:compile used: 0, used duplicate: 0, unused: 38
          org.codehaus.jackson:jackson-jaxrs:jar:1.8.3:compile used: 0, used duplicate: 0, unused: 8
          org.codehaus.jackson:jackson-xc:jar:1.8.3:compile used: 0, used duplicate: 0, unused: 12
          tomcat:jasper-compiler:jar:5.5.23:runtime used: 0, used duplicate: 0, unused: 180
          tomcat:jasper-runtime:jar:5.5.23:compile used: 0, used duplicate: 0, unused: 46
          commons-el:commons-el:jar:1.0:compile used: 0, used duplicate: 0, unused: 68
          net.java.dev.jets3t:jets3t:jar:0.9.0:compile used: 0, used duplicate: 0, unused: 312
          com.jamesmurty.utils:java-xmlbuilder:jar:0.4:compile used: 0, used duplicate: 0, unused: 4
          commons-digester:commons-digester:jar:1.8:compile used: 0, used duplicate: 0, unused: 100
          org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:compile used: 0, used duplicate: 0, unused: 502
          org.apache.curator:curator-recipes:jar:2.6.0:compile used: 0, used duplicate: 0, unused: 190
          org.tukaani:xz:jar:1.0:compile used: 0, used duplicate: 0, unused: 102
          commons-daemon:commons-daemon:jar:1.0.13:compile used: 0, used duplicate: 0, unused: 14
          io.netty:netty:jar:3.6.2.Final:compile used: 0, used duplicate: 0, unused: 811
          xerces:xercesImpl:jar:2.9.1:compile used: 0, used duplicate: 0, unused: 894
          stax:stax-api:jar:1.0.1:compile used: 0, used duplicate: 0, unused: 40
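          One way to act on a list like this is to exclude the unused transitive jars from hadoop-common in the tools pom. The fragment below is only a sketch of the approach, not the actual ORC-151 patch (that change is in the pull request); it shows the first few artifacts from the list, and the rest would follow the same pattern:

          ```xml
          <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <exclusions>
              <!-- jars the analysis above found completely unused -->
              <exclusion>
                <groupId>com.sun.jersey</groupId>
                <artifactId>jersey-json</artifactId>
              </exclusion>
              <exclusion>
                <groupId>net.java.dev.jets3t</groupId>
                <artifactId>jets3t</artifactId>
              </exclusion>
              <exclusion>
                <groupId>xerces</groupId>
                <artifactId>xercesImpl</artifactId>
              </exclusion>
              <!-- ...and so on for the other unused artifacts listed above -->
            </exclusions>
          </dependency>
          ```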

          githubbot ASF GitHub Bot added a comment -

          GitHub user omalley opened a pull request:

          https://github.com/apache/orc/pull/96

          ORC-151. Minimize the size of the tools uber jar.

          This patch takes the tools uber jar from 29mb to 23mb.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/omalley/orc orc-151

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/orc/pull/96.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #96


          commit b28ab0e98f1d90ba77047d3c767cbc82217c41a5
          Author: Owen O'Malley <omalley@apache.org>
          Date: 2017-02-27T22:09:43Z

          ORC-151. Minimize the size of the tools uber jar.


          owen.omalley Owen O'Malley added a comment -

          Istvan Szukacs It would be a lot of work to completely remove the Hadoop dependencies. If you want to investigate it, you should probably open a new jira. In particular, you'd need to:

          • Remove the hadoop dependency from hive-storage, which would require:
            • Providing alternatives to the uses of Writable without breaking compatibility. At the very least, you'll need new implementations of DecimalColumnVector, DateColumnVector, and TimestampColumnVector.
            • Replacing the functionality of WritableUtils, though that should be pretty easy.
          • If you are trying to get rid of problematic libraries, you really should remove the use of guava from hive-storage too. I thought I had removed it at one point, but it is still there.
          • You can't change the definition of orc-core, since that would break compatibility, but you could create a new module (orc-kernel?) that houses the hadoop-clean code.
          • PhysicalWriter already avoids the FileSystem on the write side; you could try making a PhysicalReader for the read path.
          • You'll need to deal with Configuration in a backwards-compatible way.

          All without introducing a performance penalty on the Hadoop users. If your goal is just reducing the size of the jar, it doesn't seem worth the work.
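          To make the read-path idea above concrete, here is a rough Java sketch of what a Hadoop-free read abstraction could look like, mirroring how PhysicalWriter decouples the write side. All names here (PhysicalReader, InMemoryReader) are hypothetical illustrations, not part of the ORC API:

          ```java
          import java.io.IOException;
          import java.nio.ByteBuffer;

          /** Hypothetical Hadoop-free read abstraction, analogous to PhysicalWriter. */
          interface PhysicalReader extends AutoCloseable {
            /** Total length of the underlying file in bytes. */
            long getLength() throws IOException;

            /** Read len bytes starting at offset into a new buffer. */
            ByteBuffer readFully(long offset, int len) throws IOException;

            @Override
            void close() throws IOException;
          }

          /** Toy in-memory implementation, for illustration only. */
          class InMemoryReader implements PhysicalReader {
            private final byte[] data;

            InMemoryReader(byte[] data) {
              this.data = data;
            }

            public long getLength() {
              return data.length;
            }

            public ByteBuffer readFully(long offset, int len) {
              // slice() so the returned buffer starts at index 0
              return ByteBuffer.wrap(data, (int) offset, len).slice();
            }

            public void close() {
              // nothing to release for an in-memory source
            }
          }
          ```

          A Hadoop-backed implementation would wrap FSDataInputStream behind the same interface, so orc-core (or a new orc-kernel) would only ever see the abstraction.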

          istvan Istvan Szukacs added a comment -

          Thanks Owen,

          This is extremely useful. I think at this stage I am going to fork orc-core and work there, trying to remove all of the Hadoop dependencies. I will let you know how it goes, and if it is feasible we can try to get that working with Hadoop as well without breaking anything.

          githubbot ASF GitHub Bot added a comment -

          Github user l1x commented on the issue:

          https://github.com/apache/orc/pull/96

          Thank you!

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/orc/pull/96

          owen.omalley Owen O'Malley added a comment -

          I committed this.

          owen.omalley Owen O'Malley added a comment -

          Released as part of ORC 1.4.0


            People

            • Assignee: owen.omalley Owen O'Malley
            • Reporter: owen.omalley Owen O'Malley
            • Votes: 0
            • Watchers: 3
