HDFS-6200

Create a separate jar for hdfs-client

    Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: build
    • Labels:

      Description

      Currently the hadoop-hdfs jar contains both the hdfs server and the hdfs client. As discussed on the hdfs-dev mailing list (http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201404.mbox/browser), downstream projects are forced to bring in additional dependencies in order to access hdfs. These additional dependencies can be difficult to manage for projects like Apache Falcon and Apache Oozie.

      This jira proposes to create a new project, hadoop-hdfs-client, which contains the client side of the hdfs code. Downstream projects can use this jar instead of hadoop-hdfs to avoid unnecessary dependencies.

      Note that this does not break compatibility for downstream projects, because old downstream projects implicitly depend on hadoop-hdfs-client through the hadoop-hdfs jar.
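
      For a downstream project, adopting the new artifact would amount to swapping one Maven coordinate. A minimal sketch (the version is illustrative, matching the 3.0.0-SNAPSHOT line this work targets, not a published release):

          <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs-client</artifactId>
            <!-- illustrative version; use whatever release ships the new jar -->
            <version>3.0.0-SNAPSHOT</version>
          </dependency>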

      Attachments

      1. HDFS-6200.007.patch
        537 kB
        Haohui Mai
      2. HDFS-6200.006.patch
        529 kB
        Haohui Mai
      3. HDFS-6200.005.patch
        481 kB
        Haohui Mai
      4. HDFS-6200.004.patch
        481 kB
        Haohui Mai
      5. HDFS-6200.003.patch
        481 kB
        Haohui Mai
      6. HDFS-6200.002.patch
        480 kB
        Haohui Mai
      7. HDFS-6200.001.patch
        478 kB
        Haohui Mai
      8. HDFS-6200.000.patch
        453 kB
        Haohui Mai

          Activity

          Haohui Mai added a comment -

           Here is the list of dependencies when I run mvn dependency:tree in hadoop-hdfs:

          $ mvn dependency:tree|grep -v ":test"
          ...
          [INFO] --- maven-dependency-plugin:2.2:tree (default-cli) @ hadoop-hdfs ---
          [INFO] org.apache.hadoop:hadoop-hdfs:jar:3.0.0-SNAPSHOT
          [INFO] +- org.apache.hadoop:hadoop-annotations:jar:3.0.0-SNAPSHOT:provided
          [INFO] |  \- jdk.tools:jdk.tools:jar:1.8:system
          [INFO] +- org.apache.hadoop:hadoop-auth:jar:3.0.0-SNAPSHOT:provided
          [INFO] |  +- org.slf4j:slf4j-api:jar:1.7.10:provided
          [INFO] |  +- org.apache.httpcomponents:httpclient:jar:4.2.5:provided
          [INFO] |  |  \- org.apache.httpcomponents:httpcore:jar:4.2.5:provided (version managed from 4.2.4)
          [INFO] |  +- org.apache.directory.server:apacheds-kerberos-codec:jar:2.0.0-M15:provided
          [INFO] |  |  +- org.apache.directory.server:apacheds-i18n:jar:2.0.0-M15:provided
          [INFO] |  |  +- org.apache.directory.api:api-asn1-api:jar:1.0.0-M20:provided
          [INFO] |  |  \- org.apache.directory.api:api-util:jar:1.0.0-M20:provided
          [INFO] |  +- org.apache.zookeeper:zookeeper:jar:3.4.6:provided
          [INFO] |  \- org.apache.curator:curator-framework:jar:2.7.1:provided
          [INFO] +- org.apache.hadoop:hadoop-common:jar:3.0.0-SNAPSHOT:provided
          [INFO] |  +- org.apache.commons:commons-math3:jar:3.1.1:provided
          [INFO] |  +- commons-httpclient:commons-httpclient:jar:3.1:provided
          [INFO] |  +- commons-net:commons-net:jar:3.1:provided
          [INFO] |  +- commons-collections:commons-collections:jar:3.2.1:provided
          [INFO] |  +- javax.servlet.jsp:jsp-api:jar:2.1:provided
          [INFO] |  +- com.sun.jersey:jersey-json:jar:1.9:provided
          [INFO] |  |  +- org.codehaus.jettison:jettison:jar:1.1:provided
          [INFO] |  |  +- com.sun.xml.bind:jaxb-impl:jar:2.2.3-1:provided
          [INFO] |  |  |  \- javax.xml.bind:jaxb-api:jar:2.2.2:provided
          [INFO] |  |  |     +- javax.xml.stream:stax-api:jar:1.0-2:provided
          [INFO] |  |  |     \- javax.activation:activation:jar:1.1:provided
          [INFO] |  |  +- org.codehaus.jackson:jackson-jaxrs:jar:1.9.13:provided (version managed from 1.8.3)
          [INFO] |  |  \- org.codehaus.jackson:jackson-xc:jar:1.9.13:provided (version managed from 1.8.3)
          [INFO] |  +- net.java.dev.jets3t:jets3t:jar:0.9.0:provided
          [INFO] |  |  \- com.jamesmurty.utils:java-xmlbuilder:jar:0.4:provided
          [INFO] |  +- commons-configuration:commons-configuration:jar:1.6:provided
          [INFO] |  |  +- commons-digester:commons-digester:jar:1.8:provided
          [INFO] |  |  |  \- commons-beanutils:commons-beanutils:jar:1.7.0:provided
          [INFO] |  |  \- commons-beanutils:commons-beanutils-core:jar:1.8.0:provided
          [INFO] |  +- org.apache.avro:avro:jar:1.7.4:provided
          [INFO] |  |  +- com.thoughtworks.paranamer:paranamer:jar:2.3:provided
          [INFO] |  |  \- org.xerial.snappy:snappy-java:jar:1.0.4.1:provided
          [INFO] |  +- com.google.code.gson:gson:jar:2.2.4:provided
          [INFO] |  +- com.jcraft:jsch:jar:0.1.42:provided
          [INFO] |  +- org.apache.curator:curator-client:jar:2.7.1:provided
          [INFO] |  +- org.apache.curator:curator-recipes:jar:2.7.1:provided
          [INFO] |  \- org.apache.commons:commons-compress:jar:1.4.1:provided
          [INFO] |     \- org.tukaani:xz:jar:1.0:provided
          [INFO] +- com.google.guava:guava:jar:11.0.2:compile
          [INFO] |  \- com.google.code.findbugs:jsr305:jar:3.0.0:compile
          [INFO] +- org.mortbay.jetty:jetty:jar:6.1.26:compile
          [INFO] +- org.mortbay.jetty:jetty-util:jar:6.1.26:compile
          [INFO] +- com.sun.jersey:jersey-core:jar:1.9:compile
          [INFO] +- com.sun.jersey:jersey-server:jar:1.9:compile
          [INFO] |  \- asm:asm:jar:3.2:compile (version managed from 3.1)
          [INFO] +- commons-cli:commons-cli:jar:1.2:compile
          [INFO] +- commons-codec:commons-codec:jar:1.4:compile
          [INFO] +- commons-io:commons-io:jar:2.4:compile
          [INFO] +- commons-lang:commons-lang:jar:2.6:compile
          [INFO] +- commons-logging:commons-logging:jar:1.1.3:compile
          [INFO] +- commons-daemon:commons-daemon:jar:1.0.13:compile
          [INFO] +- log4j:log4j:jar:1.2.17:compile
          [INFO] +- com.google.protobuf:protobuf-java:jar:2.5.0:compile
          [INFO] +- javax.servlet:servlet-api:jar:2.5:compile
          [INFO] +- org.slf4j:slf4j-log4j12:jar:1.7.10:provided
          [INFO] +- org.codehaus.jackson:jackson-core-asl:jar:1.9.13:compile
          [INFO] +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:compile
          [INFO] +- xmlenc:xmlenc:jar:0.52:compile
          [INFO] +- io.netty:netty-all:jar:4.0.23.Final:compile
          [INFO] +- xerces:xercesImpl:jar:2.9.1:compile
          [INFO] |  \- xml-apis:xml-apis:jar:1.3.04:compile
          [INFO] +- org.apache.htrace:htrace-core:jar:3.1.0-incubating:compile
          [INFO] +- org.fusesource.leveldbjni:leveldbjni-all:jar:1.8:compile
          

           As I mentioned earlier, I plan to keep the dependencies on hadoop-common / hadoop-auth for the first phase, which would allow us to get rid of the following dependencies in the client jar:

          [INFO] +- com.google.guava:guava:jar:11.0.2:compile
          [INFO] |  \- com.google.code.findbugs:jsr305:jar:3.0.0:compile
          [INFO] +- org.mortbay.jetty:jetty:jar:6.1.26:compile
          [INFO] +- org.mortbay.jetty:jetty-util:jar:6.1.26:compile
          [INFO] +- com.sun.jersey:jersey-core:jar:1.9:compile
          [INFO] +- com.sun.jersey:jersey-server:jar:1.9:compile
          [INFO] |  \- asm:asm:jar:3.2:compile (version managed from 3.1)
          [INFO] +- commons-cli:commons-cli:jar:1.2:compile
          [INFO] +- commons-codec:commons-codec:jar:1.4:compile
          [INFO] +- commons-io:commons-io:jar:2.4:compile
          [INFO] +- commons-lang:commons-lang:jar:2.6:compile
          [INFO] +- commons-logging:commons-logging:jar:1.1.3:compile
          [INFO] +- commons-daemon:commons-daemon:jar:1.0.13:compile
          [INFO] +- log4j:log4j:jar:1.2.17:compile
          [INFO] +- com.google.protobuf:protobuf-java:jar:2.5.0:compile
          [INFO] +- javax.servlet:servlet-api:jar:2.5:compile
          [INFO] +- xmlenc:xmlenc:jar:0.52:compile
          [INFO] +- io.netty:netty-all:jar:4.0.23.Final:compile
          [INFO] +- xerces:xercesImpl:jar:2.9.1:compile
          [INFO] |  \- xml-apis:xml-apis:jar:1.3.04:compile
          [INFO] +- org.fusesource.leveldbjni:leveldbjni-all:jar:1.8:compile
          

           These dependencies will be kept in addition to hadoop-common and hadoop-auth:

          [INFO] +- org.slf4j:slf4j-log4j12:jar:1.7.10:provided -- (for logging)
          [INFO] +- org.codehaus.jackson:jackson-core-asl:jar:1.9.13:compile (parsing JSON for webhdfs)
          [INFO] +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:compile (might not be needed for parsing JSON for webhdfs, need to double check)
          [INFO] +- org.apache.htrace:htrace-core:jar:3.1.0-incubating:compile (for htrace in DFSClient)
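
           Based on the list above, the dependency section of a hadoop-hdfs-client pom might look roughly like the sketch below. This is illustrative only: the exact set was still under discussion, and versions would come from the parent pom's dependency management.

             <dependencies>
               <!-- kept for the first phase, per the plan above -->
               <dependency>
                 <groupId>org.apache.hadoop</groupId>
                 <artifactId>hadoop-common</artifactId>
                 <scope>provided</scope>
               </dependency>
               <dependency>
                 <groupId>org.apache.hadoop</groupId>
                 <artifactId>hadoop-auth</artifactId>
                 <scope>provided</scope>
               </dependency>
               <dependency>
                 <groupId>org.slf4j</groupId>
                 <artifactId>slf4j-log4j12</artifactId>
                 <scope>provided</scope>
               </dependency>
               <!-- JSON parsing for webhdfs; jackson-mapper-asl pending the double check above -->
               <dependency>
                 <groupId>org.codehaus.jackson</groupId>
                 <artifactId>jackson-core-asl</artifactId>
               </dependency>
               <!-- htrace in DFSClient -->
               <dependency>
                 <groupId>org.apache.htrace</groupId>
                 <artifactId>htrace-core</artifactId>
               </dependency>
             </dependencies>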
          
          Alejandro Abdelnur added a comment -

          Haohui,

          Could you please list the actual set of dependencies the hdfs-client will carry?

           Haohui Mai added a comment - edited

           Thanks tucu. Just to clarify – I'm not trashing the classloader solution; I agree that it has its own value on the YARN/MR side. I don't see them as competing solutions; they provide value in different use cases. I don't think we need to mix the two issues.

          Alejandro Abdelnur added a comment -

          Haohui,

           Doing what hadoop-client does won't solve the problems you want to tackle; it will just remove the JARs used only on the HDFS server side. If you just care about those server-side dependencies, hadoop-client should be enough, and you could exclude the YARN/MR artifacts in your dependency.

           If you want to take care of guava, commons-*, etc., you'll need classloader magic for the filesystem impls, and this should be done in common, where the Hadoop FileSystem API lives, so all Hadoop FileSystem implementations get this kind of isolation.

           Haohui Mai added a comment - edited

          Placing hadoop-hdfs-client as a dependency of hadoop-hdfs sets up a relationship that we'll have to adjust in the future if we e.g. decide that shading the third-party dependencies of hadoop-hdfs-client is the way to go.

          Don't you agree we need a client jar?

           I see your point. This jira, however, is about creating the client jar. Everything below the client jar is an implementation detail. I don't think it needs to be mixed with this jira.

          Personally, I think having things stay where they are and using maven to build the client artifact will be the easiest to maintain

           I don't agree. We did that for hadoop-client, which is available today. You're more than welcome to contribute and to clean things up. We've been hit really hard resolving dependency conflicts in Oozie (which uses Tomcat's classloader), Ranger (which depends on a different version of jersey-server), and Spark (which has a conflicting version of asm). A clean solution that fixes all of these problems would be appreciated.
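
           For context, the per-artifact exclusions that downstream projects resort to today look roughly like the sketch below, using the artifacts named in this thread (version as in the tree output above); every such block has to be maintained by hand, which is the burden a client jar removes.

             <dependency>
               <groupId>org.apache.hadoop</groupId>
               <artifactId>hadoop-hdfs</artifactId>
               <version>3.0.0-SNAPSHOT</version>
               <exclusions>
                 <!-- server-side jars that clash with Ranger's jersey-server and Spark's asm -->
                 <exclusion>
                   <groupId>com.sun.jersey</groupId>
                   <artifactId>jersey-server</artifactId>
                 </exclusion>
                 <exclusion>
                   <groupId>asm</groupId>
                   <artifactId>asm</artifactId>
                 </exclusion>
               </exclusions>
             </dependency>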

          Sean Busbey added a comment -

           As I mentioned earlier, the dependencies your client artifact brings with it are a defining part of the interface you are exposing to downstream applications. That means we need the ability to manipulate those dependencies, even if we're only going to do so at a later date. Placing hadoop-hdfs-client as a dependency of hadoop-hdfs sets up a relationship that we'll have to adjust in the future if we e.g. decide that shading the third-party dependencies of hadoop-hdfs-client is the way to go.

          I only mention the internal artifact as an alternative if having DFSClient live in hadoop-hdfs is undesirable. Personally, I think having things stay where they are and using maven to build the client artifact will be the easiest to maintain. However, there might be other mitigating factors I'm not aware of that make breaking the code into a new module desirable.

           Haohui Mai added a comment - edited

          For one, we don't have to worry about what dependencies we bring with us in the internal case because by definition we're in control of both the client interface and the place it's being used.

           In the approach I'm suggesting the original code for the client would still live in hadoop-hdfs, so the webhdfs server would be free to use DFSClient. If that is unappealing for some reason, perhaps we should structure things with an internal client artifact. e.g.

           What about (1) hiding implementations in a local package when possible, and (2) marking them as private classes, as we do today, when the first option is unavailable?

           I don't think now is the time to create yet another artifact. There is quite a bit of overhead associated with it, and I have yet to see that it is justified. If it is indeed required we can do it after hdfs-client is separated out.

          Sean Busbey added a comment -

          The dependencies you bring with you are an integral part of the interface you define for downstream clients. While I agree that it can be a separate subtask, it has to be considered as part of how you structure the overall approach.

          Unfortunately the dependency is a real one – the webhdfs server on DN uses DFSClient to read data from HDFS.

          Our own internal use of client interfaces isn't the same thing as downstream application uses. For one, we don't have to worry about what dependencies we bring with us in the internal case because by definition we're in control of both the client interface and the place it's being used.

           In the approach I'm suggesting the original code for the client would still live in hadoop-hdfs, so the webhdfs server would be free to use DFSClient. If that is unappealing for some reason, perhaps we should structure things with an internal client artifact. e.g.

              hadoop-hdfs -- depends on --> hadoop-hdfs-client-internal
              hadoop-hdfs-client -- depends on --> hadoop-hdfs-client-internal
          
           Haohui Mai added a comment - edited

          we could instead use it to build an aggregate jar with no transitive dependencies

          (since there will presumably be some shaded or otherwise isolated version of third party libraries present).

          This is orthogonal. I don't think we need to mix these issues in this jira.

          we should not make the old hadoop-hdfs depend on it (since there will presumably be some shaded or otherwise isolated version of third party libraries present).

          Unfortunately the dependency is a real one – the webhdfs server on DN uses DFSClient to read data from HDFS.

          Sean Busbey added a comment -

           Since this new artifact is opt-in (clients would have to change to it), we could instead use it to build an aggregate jar with no transitive dependencies. For this approach, we should not make the old hadoop-hdfs depend on it (since there will presumably be some shaded or otherwise isolated version of third party libraries present).

          We could still do the move incrementally by relying on maven to build the artifact with just those classes we need from hadoop-hdfs.

          That way, extant downstream applications who want to keep the current behavior can keep depending on hadoop-hdfs (or hadoop-client or whatever), and downstream applications who want the improved client dependency can change. When we're ready for a breaking change, we similarly announce that downstream applications should not be relying on hadoop-hdfs.

          Haohui Mai added a comment -

          Here is the proposal for the first step:

           Summary: (1) the changes are backward compatible, and (2) the changes will be done incrementally to minimize risk.

           • Update the pom.xml to create a new module, hadoop-hdfs-client, and publish it to the Maven repository (see the sketch below).
           • The old hadoop-hdfs jar depends on the hadoop-hdfs-client jar, so there should be no changes for downstream applications.
           • Move the client implementation from hadoop-hdfs to hadoop-hdfs-client incrementally. This can be done in trunk and reviewed.
           • Once the move is finished, announce that applications can depend on hadoop-hdfs-client only.
           • This jira leaves hadoop-common untouched; we'll take care of it in a separate jira.

          Thoughts?
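
           A rough sketch of what the first step implies for the build, with the module and artifact names taken from the proposal and everything else illustrative:

             <!-- hadoop-hdfs-project/pom.xml: register the new module (other existing modules omitted) -->
             <modules>
               <module>hadoop-hdfs-client</module>
               <module>hadoop-hdfs</module>
             </modules>

             <!-- hadoop-hdfs/pom.xml: the server jar stays compatible by depending on the client jar -->
             <dependency>
               <groupId>org.apache.hadoop</groupId>
               <artifactId>hadoop-hdfs-client</artifactId>
               <version>${project.version}</version>
             </dependency>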

          Sanjay Radia added a comment -

          +++1 for this proposal.

          Vinod Kumar Vavilapalli added a comment -

          +1000 for this proposal! (Not looked at the patch)
          Reproducing my comments at HADOOP-11656.

           • Having a separate hdfs client JAR would vastly reduce the amount of classpath conflicts. We have seen that in practice: when we moved from Hadoop-1 MR to YARN, having a leaner client JAR avoided a whole lot of the problems we had before, even if it wasn't perfectly done.
           • A lean client JAR is also a major help in how we rationalize stack-wide rolling upgrades: today the NameNode is on the classpath of the ResourceManager and RegionServer even if it doesn't get used, so it is very hard to lay out and upgrade bits easily.
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12651333/HDFS-6200.007.patch
          against trunk revision e9ac88a.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9692//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12651333/HDFS-6200.007.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          -1 javac. The patch appears to cause the build to fail.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7174//console

          This message is automatically generated.

          Haohui Mai added a comment -

           1. DFSConfigKeys should be public and contain those keys clients are expected to use
           2. this is a good time to switch to SLF4J for the logging here, and drop commons-logging
           3. HdfsFileStatus gets its "public final" declarations in the wrong order ... again, this is a good time to fix it.
           4. JsonUtilClient uses org.mortbay.util.ajax.JSON to parse the JSON. This should be replaced by Jackson, so we don't need the mortbay libs on the classpath?
           5. WebHdfsFileSystem also uses mortbay JSON for parsing

           It might be better to restrict this patch to moving the files only. I plan to address these issues in separate jiras. Filed HDFS-6564, HDFS-6565, HDFS-6566, and HDFS-6567 to track them.

          Steve Loughran added a comment -
           1. DFSConfigKeys should be public and contain those keys clients are expected to use
           2. this is a good time to switch to SLF4J for the logging here, and drop commons-logging
           3. HdfsFileStatus gets its "public final" declarations in the wrong order ... again, this is a good time to fix it.
           4. JsonUtilClient uses org.mortbay.util.ajax.JSON to parse the JSON. This should be replaced by Jackson, so we don't need the mortbay libs on the classpath?
           5. WebHdfsFileSystem also uses mortbay JSON for parsing
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12651236/HDFS-6200.006.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal:

          org.apache.hadoop.hdfs.web.TestWebHDFSAcl

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7164//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7164//console

          This message is automatically generated.

          Haohui Mai added a comment -

           Rebased onto the latest trunk.

          Haohui Mai added a comment -

          Tsz Wo Nicholas Sze, Alejandro Abdelnur, and Steve Loughran, can you please take a look at this patch?

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12639158/HDFS-6200.005.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6618//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6618//console

          This message is automatically generated.

          Haohui Mai added a comment -

          The v5 patch fixes the audit warning.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12639135/HDFS-6200.004.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          -1 release audit. The applied patch generated 1 release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6615//testReport/
          Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/6615//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6615//console

          This message is automatically generated.

          Haohui Mai added a comment -

          Rebased

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12639133/HDFS-6200.003.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6614//console

          This message is automatically generated.

          Haohui Mai added a comment -

          The v3 patch fixes the findbugs and the unit test issues.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12639104/HDFS-6200.002.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal:

          org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6612//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6612//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12639097/HDFS-6200.001.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 javac. The patch appears to cause the build to fail.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6611//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12639093/HDFS-6200.000.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6610//console

          This message is automatically generated.

          Haohui Mai added a comment -

           The v0 patch moves WebHdfsFileSystem and SWebHdfsFileSystem into a separate jar. The patch moves the files to a different project, except for the following:

           1. It modifies DFSConfigKeys so that it no longer depends on the AuthFilter and BlockPlacementPolicyDefault classes.
           2. It moves some methods from HAUtil / DFSUtilClient in the original hdfs jar to HAUtilClient / DFSUtilClient in the client-side jar.
           3. It moves some methods from JsonUtils in the original hdfs jar to JsonUtilClient in the client-side jar, and adds a new function that translates the JSON representation of a BlockLocation object into the Java object.

          To review this patch, one can use git diff -M to omit the renames.
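
           For reference, a typical invocation (the branch name and extra flags are illustrative):

           $ git diff -M --stat trunk   # summary; moved-but-unchanged files show up as renames
           $ git diff -M trunk          # full diff with rename detection, omitting moved content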


            People

            • Assignee: Haohui Mai
            • Reporter: Haohui Mai
            • Votes: 0
            • Watchers: 24
