Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.1
    • Fix Version/s: 3.0.0
    • Component/s: fs
    • Labels:
    • Hadoop Flags:
      Reviewed
    • Release Note:
      HDFS Router-based Federation adds an RPC routing layer that provides a federated view of multiple HDFS namespaces.
      This is similar to the existing ViewFS and HDFS federation functionality, except the mount table is managed on the server-side by the routing layer rather than on the client.
      This simplifies access to a federated cluster for existing HDFS clients.

      See HDFS-10467 and the HDFS Router-based Federation documentation for more details.
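
      For illustration, here is a minimal Java sketch of what this transparency means for an existing client. The Router address (router.example.com:8888) is a made-up placeholder for this example, not a value defined by the feature:

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        // Minimal sketch: an unmodified HDFS client reaching the federation
        // through a Router. Host and port below are placeholders.
        public class RouterClientSketch {
          public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Point the default filesystem at a Router instead of a NameNode.
            conf.set("fs.defaultFS", "hdfs://router.example.com:8888");
            try (FileSystem fs = FileSystem.get(conf)) {
              // The Router resolves /data against the server-side mount table
              // and forwards the RPC to the NameNode of the owning subcluster.
              fs.mkdirs(new Path("/data"));
            }
          }
        }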

      Description

      Add a Router to provide a federated view of multiple HDFS clusters.

      1. HDFS Router Federation.pdf
        944 kB
        Íñigo Goiri
      2. HDFS-Router-Federation-Prototype.patch
        616 kB
        Íñigo Goiri
      3. HDFS-10467.PoC.patch
        664 kB
        Íñigo Goiri
      4. HDFS-10467.PoC.001.patch
        725 kB
        Íñigo Goiri
      5. HDFS-10467.002.patch
        897 kB
        Íñigo Goiri

        Issue Links

        There are no Sub-Tasks for this issue.

          Activity

          elgoiri Íñigo Goiri added a comment -

          First draft with the Router-based HDFS federation.

          elgoiri Íñigo Goiri added a comment -

          The advantages of this approach are:

          • Transparent to users: a user can keep using the regular HDFS client, since the Router looks like a regular Namenode; there is no client-side mount table to maintain.
          • Transparent rebalancing of subclusters: with this design, we can move data between subclusters and hide it behind the Router.
          • No additional changes to current HDFS: the Routers and the State Store are completely independent from the current NNs and DNs, so no changes to the current HDFS code are needed (we could add some functionality to the NN for a minor performance improvement, but it is not required); see the State Store sketch after this list.

          We have a prototype running with 4 subclusters and a couple of implementations for the State Store side. Once we agree on the design, I can start creating the subtasks.
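
          For concreteness, here is a rough sketch of the kind of pluggable State Store driver described above. The interface and method names are loosely modeled on class names visible in the PoC patch (StateStoreDriver and its file-, filesystem-, and ZooKeeper-backed implementations) and are illustrative rather than a committed API:

            import java.io.IOException;
            import java.util.Collection;
            import java.util.Map;

            // Illustrative sketch only: shows why the Routers and the State Store
            // can stay independent of the NameNodes and DataNodes.
            public interface StateStoreDriverSketch {
              /** Read every record of the given type, e.g. mount table entries. */
              <T> Collection<T> getAll(Class<T> recordType) throws IOException;

              /** Read a single record matching the given query fields. */
              <T> T getSingle(Class<T> recordType, Map<String, String> query)
                  throws IOException;

              /** Overwrite the stored records of the given type. */
              <T> boolean writeAll(Collection<T> records, Class<T> recordType)
                  throws IOException;
            }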

          cnauroth Chris Nauroth added a comment -

          Íñigo Goiri, thank you for sharing this. A similar discussion came up recently on the hdfs-dev@hadoop.apache.org mailing list, so it appears you are not alone in this requirement.

          http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201605.mbox/%3C1462210332.1520687.595811233.2B297F6A%40webmail.messagingengine.com%3E

          elgoiri Íñigo Goiri added a comment -

          This approach should support the scenarios in the mail thread. Right now we support the most typical operations of the RPC interface and the basic ones for REST. We don't proxy requests to the DNs as HttpFS does, but at some point we might; for now we just point HttpFS to our Routers. Instead of extending HttpFS, we went with mimicking the NN to provide the full interface (i.e., RPC) and present the image of one large NN.

          The main difference from the approach proposed in the mail thread is the addition of the State Store as centralized storage for the federation state. This mimics the architecture of YARN federation in YARN-2915.

          If the document is not enough, I can provide a patch with the full approach (including the Router, the State Store, and a simple cluster rebalancer). I think later on we should split it into multiple subtasks.
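
          To make the forwarding idea concrete, here is a minimal sketch of a Router proxying one NN RPC. The MountResolver helper and RemoteLocation holder are hypothetical names invented for this example; real mount resolution lives in the State Store:

            import java.io.IOException;
            import org.apache.hadoop.hdfs.protocol.ClientProtocol;
            import org.apache.hadoop.hdfs.protocol.HdfsFileStatus;

            // Sketch of the forwarding pattern: the Router exposes the same RPC
            // surface as a NameNode and proxies each call to the NameNode that
            // owns the path.
            public class RouterRpcForwardingSketch {
              /** Hypothetical helper: map a federated path to (subcluster NN, local path). */
              public interface MountResolver {
                RemoteLocation resolve(String path);
              }

              /** Hypothetical holder for the resolution result. */
              public static class RemoteLocation {
                final ClientProtocol namenode; // proxy to the owning subcluster's NN
                final String localPath;        // the path within that subcluster
                RemoteLocation(ClientProtocol nn, String path) {
                  this.namenode = nn;
                  this.localPath = path;
                }
              }

              private final MountResolver resolver;

              public RouterRpcForwardingSketch(MountResolver resolver) {
                this.resolver = resolver;
              }

              /** Same signature a client expects from a NameNode, forwarded instead. */
              public HdfsFileStatus getFileInfo(String src) throws IOException {
                RemoteLocation loc = resolver.resolve(src);
                return loc.namenode.getFileInfo(loc.localPath);
              }
            }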

          elgoiri Íñigo Goiri added a comment -

          He Tianyi for awareness.

          elgoiri Íñigo Goiri added a comment -

          Thanks for early feedback to Chris Douglas, Gera Shegalov, and Subru Krishnan.

          He Tianyi He Tianyi added a comment -

          Thanks for sharing.

          I've implemented a similar approach as a separate project; see https://github.com/bytedance/nnproxy.
          I am currently using it in production to back 2 namenodes with a mount table of 20+ entries, and it has worked well (about 12K TPS).

          Looks like HDFS Router Federation includes more features. Shall we work together?

          elgoiri Íñigo Goiri added a comment -

          He Tianyi, our proposal requires additional components (a State Store and a Router), so it might be a little too complex for what you want.
          Let me post a patch with our prototype during the week and, if it sounds reasonable to you, you can decide whether to merge efforts.

          elgoiri Íñigo Goiri added a comment -

          Prototype on top of Hadoop 2.6 to give an idea of our proposal. The ZooKeeper- and HDFS-backed State Store implementations are still missing.

          elgoiri Íñigo Goiri added a comment -

          Another advantage of the proposed approach is that the Routers take care of Namenode failover, which simplifies the client side.
          Indirectly, this uses the approach based on centralized information that was discarded in HDFS-7858.
          That approach was discarded in that context, but I think it is reasonable here.
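
          As a rough illustration of router-side failover, here is a minimal sketch assuming each subcluster is an HA pair of NameNodes. The retry and memorization policy shown is invented for the example, not the exact logic in the patch:

            import java.io.IOException;
            import java.util.List;
            import org.apache.hadoop.hdfs.protocol.ClientProtocol;
            import org.apache.hadoop.ipc.StandbyException;

            // Sketch: the Router remembers which NameNode of an HA pair answered
            // last and retries the others on StandbyException, so clients never
            // see the failover themselves.
            public class RouterFailoverSketch {
              /** Hypothetical functional wrapper around a single ClientProtocol RPC. */
              public interface RpcCall<T> {
                T run(ClientProtocol namenode) throws IOException;
              }

              private volatile int lastActive = 0; // index of last known active NN

              public <T> T invoke(List<ClientProtocol> haNamenodes, RpcCall<T> call)
                  throws IOException {
                int n = haNamenodes.size();
                IOException lastError = new IOException("No NameNodes configured");
                for (int i = 0; i < n; i++) {
                  int idx = (lastActive + i) % n; // try the remembered active NN first
                  try {
                    T result = call.run(haNamenodes.get(idx));
                    lastActive = idx; // memorize the active peer for the next request
                    return result;
                  } catch (StandbyException e) {
                    lastError = e; // a standby answered; fall through to the next NN
                  }
                }
                throw lastError;
              }
            }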

          He Tianyi He Tianyi added a comment -

          +1. This also reduces latency for the first request from a client (no failover on the client side, and the Router can memorize the current active peer).

          zhz Zhe Zhang added a comment - edited

          Thanks for the design doc and patch Íñigo Goiri, very interesting work.

          A quick suggestion on the PoC patch first: it doesn't really apply on branch-2.6. Could you attach either a link to your PoC GitHub branch or a PoC patch based on some stable branch? That way people (including myself) can view it as a working PoC project. The dependency on SQL Server also seems to be causing trouble in the build.

          elgoiri Íñigo Goiri added a comment - edited

          Zhe Zhang, true, it was done on our internal 2.6 branch.
          I'll prepare a patch for trunk and disable the SQL driver by tomorrow.
          Let me know if some branch other than trunk is better as a base.

          zhz Zhe Zhang added a comment -

          Thanks Íñigo Goiri! I think most likely we will cut a feature branch for this work, so rebasing the patch on trunk won't be wasted effort.

          After that, I suggest you push trunk + your PoC to a personal github branch. Otherwise, trunk itself is a moving target and it will be hard to apply and evaluate the PoC patch again.

          elgoiri Íñigo Goiri added a comment -

          Prototype on trunk (not fully tested, though).

          elgoiri Íñigo Goiri added a comment -

          I went through the rebase onto trunk and there are just a couple of changes in Server, Client, and a couple of related classes.
          It should be easy to keep rebasing the patch as needed.

          I haven't been able to fully test it on trunk yet but we'll go over it during the day.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 15s Docker mode activated.
          0 shelldocs 0m 1s Shelldocs was not available.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 14 new or modified test files.
          0 mvndep 0m 15s Maven dependency ordering for branch
          +1 mvninstall 7m 38s trunk passed
          +1 compile 8m 25s trunk passed
          +1 checkstyle 1m 39s trunk passed
          +1 mvnsite 1m 51s trunk passed
          +1 mvneclipse 0m 29s trunk passed
          +1 findbugs 3m 5s trunk passed
          +1 javadoc 2m 3s trunk passed
          0 mvndep 0m 14s Maven dependency ordering for patch
          +1 mvninstall 1m 40s the patch passed
          +1 compile 6m 50s the patch passed
          +1 cc 6m 50s the patch passed
          -1 javac 6m 50s root generated 4 new + 695 unchanged - 2 fixed = 699 total (was 697)
          -1 checkstyle 1m 49s root: The patch generated 609 new + 1185 unchanged - 5 fixed = 1794 total (was 1190)
          +1 mvnsite 1m 49s the patch passed
          +1 mvneclipse 0m 27s the patch passed
          -1 shellcheck 0m 12s The patch generated 4 new + 80 unchanged - 1 fixed = 84 total (was 81)
          -1 whitespace 0m 0s The patch has 48 line(s) that end in whitespace. Use git apply --whitespace=fix.
          -1 whitespace 0m 2s The patch has 37 line(s) with tabs.
          +1 xml 0m 3s The patch has no ill-formed XML file.
          -1 findbugs 1m 56s hadoop-hdfs-project/hadoop-hdfs generated 44 new + 0 unchanged - 0 fixed = 44 total (was 0)
          -1 javadoc 1m 5s hadoop-hdfs-project_hadoop-hdfs generated 65 new + 7 unchanged - 0 fixed = 72 total (was 7)
          +1 unit 16m 15s hadoop-common in the patch passed.
          -1 unit 83m 50s hadoop-hdfs in the patch failed.
          -1 asflicense 0m 30s The patch generated 1 ASF License warnings.
          145m 55s



          Reason Tests
          FindBugs module:hadoop-hdfs-project/hadoop-hdfs
            Unread field: StateStoreMetrics.java:[line 55]
            org.apache.hadoop.hdfs.server.federation.router.FederationConnectionId doesn't override org.apache.hadoop.ipc.Client$ConnectionId.equals(Object) At FederationConnectionId.java:[line 1]
            org.apache.hadoop.hdfs.server.federation.router.FederationUtil.getJmx(String, String) may fail to close stream At FederationUtil.java:[line 48]
            Should org.apache.hadoop.hdfs.server.federation.router.NamenodeHearbeatService$NamenodeStatusReport be a static inner class? At NamenodeHearbeatService.java:[lines 91-141]
            Unread public/protected field: At NamenodeHearbeatService.java:[line 95]
            org.apache.hadoop.hdfs.server.federation.router.Router.initAndStartRouter(Configuration, boolean) invokes System.exit(...), which shuts down the entire virtual machine At Router.java:[line 551]
            Exception is caught when Exception is not thrown in org.apache.hadoop.hdfs.server.federation.router.Router.getConnection(MembershipStateEntity) At Router.java:[line 931]
            org.apache.hadoop.hdfs.server.federation.router.Router$CustomSocketFactory doesn't override org.apache.hadoop.net.StandardSocketFactory.equals(Object) At Router.java:[line 1]
            Should org.apache.hadoop.hdfs.server.federation.router.Router$CustomSocketFactory be a static inner class? At Router.java:[lines 353-361]
            instanceof will always return false in org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getListing(String, byte[], boolean), since a RuntimeException can't be a java.io.IOException At RouterRpcServer.java:[line 794]
            Redundant nullcheck of connection, which is known to be non-null in org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer$DefaultLookup.getClient() At RouterRpcServer.java:[line 925]
            Write to static field org.apache.hadoop.hdfs.server.federation.statestore.model.MembershipStateEntity._expirationMs from instance method org.apache.hadoop.hdfs.server.federation.statestore.StateStore.init(Configuration) At StateStore.java:[line 164]
            Write to static field org.apache.hadoop.hdfs.server.federation.statestore.model.MountTableEntity._expirationMs from instance method org.apache.hadoop.hdfs.server.federation.statestore.StateStore.init(Configuration) At StateStore.java:[line 174]
            Write to static field org.apache.hadoop.hdfs.server.federation.statestore.model.RouterEntity._expirationMs from instance method org.apache.hadoop.hdfs.server.federation.statestore.StateStore.init(Configuration) At StateStore.java:[line 169]
            org.apache.hadoop.hdfs.server.federation.statestore.StateStore.overrideRegistration(String, String, MembershipStateEntity$FederationNamenodeServiceState) does not release lock on all exception paths At StateStore.java:[line 848]
            org.apache.hadoop.hdfs.server.federation.statestore.impl.MountTableManager.getDestination(String) does not release lock on all exception paths At MountTableManager.java:[line 107]
            org.apache.hadoop.hdfs.server.federation.statestore.impl.MountTableManager$MountTableTreeNode.toString(int) concatenates strings using + in a loop At MountTableManager.java:[line 331]
            org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreDriver.deserializeObject(Object, Class) invokes inefficient Boolean constructor; use Boolean.valueOf(...) instead At StateStoreDriver.java:[line 411]
            Primitive boxed just to call toString in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreDriver.serializeObject(Object, Class) At StateStoreDriver.java:[line 382]
            org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreDriver.deserializeObject(Object, Class) invokes inefficient new Integer(String) constructor; use Integer.valueOf(String) instead At StateStoreDriver.java:[line 409]
            org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreDriver.deserializeObject(Object, Class) invokes inefficient new Long(String) constructor; use Long.valueOf(String) instead At StateStoreDriver.java:[line 407]
            org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreDriver.serializeObject(Object, Class) invokes inefficient new Long(long) constructor; use Long.valueOf(long) instead At StateStoreDriver.java:[line 382]
            Redundant nullcheck of results, which is known to be non-null in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreDriver.getSingle(Class, Map) At StateStoreDriver.java:[line 204]
            org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreDriver.filterMultiple(Map, Iterable) makes inefficient use of keySet iterator instead of entrySet iterator At StateStoreDriver.java:[line 120]
            Found reliance on default encoding in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreFile.readAllLinesFromFile(String): new java.io.FileReader(File) At StateStoreFile.java:[line 191]
            Found reliance on default encoding in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreFile.writeAll(Collection, Class): new java.io.FileWriter(File, boolean) At StateStoreFile.java:[line 276]
            Exceptional return value of java.io.File.createNewFile() ignored in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreFile.init(Configuration, StateStore) At StateStoreFile.java:[line 118]
            Exceptional return value of java.io.File.mkdirs() ignored in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreFile.init(Configuration, StateStore) At StateStoreFile.java:[line 106]
            org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreFile.getAll(Class) does not release lock on all exception paths At StateStoreFile.java:[line 241]
            org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreFile.writeAll(Collection, Class) does not release lock on all exception paths At StateStoreFile.java:[line 272]
            Found reliance on default encoding in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreFileSystem.getAll(Class): new java.io.InputStreamReader(InputStream) At StateStoreFileSystem.java:[line 113]
            Found reliance on default encoding in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreFileSystem.writeAll(Collection, Class): new java.io.OutputStreamWriter(OutputStream) At StateStoreFileSystem.java:[line 163]
            org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreFileSystem.getAll(Class) may fail to close stream At StateStoreFileSystem.java:[line 114]
            Found reliance on default encoding in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreZooKeeper.writeAll(Collection, Class): String.getBytes() At StateStoreZooKeeper.java:[line 402]
            Inconsistent synchronization of org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreZooKeeper.zkAuths; locked 50% of time. Unsynchronized access at StateStoreZooKeeper.java:[line 127]
            Inconsistent synchronization of org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreZooKeeper.zkHost; locked 50% of time. Unsynchronized access at StateStoreZooKeeper.java:[line 123]
            Found reliance on default encoding in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreZooKeeper$ZKData.getData(): new String(byte[]) At StateStoreZooKeeper.java:[line 285]
            Should org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreZooKeeper$ZKData be a static inner class? At StateStoreZooKeeper.java:[lines 280-285]
            Exception is caught when Exception is not thrown in org.apache.hadoop.hdfs.server.federation.statestore.model.BaseEntity.serialize(StateStoreDriver) At BaseEntity.java:[line 325]
            org.apache.hadoop.hdfs.server.federation.statestore.model.BaseEntity.generateCacheKey(Map) makes inefficient use of keySet iterator instead of entrySet iterator At BaseEntity.java:[line 179]
            Nullcheck of BaseEntity.dateModified at line 222 of value previously dereferenced in org.apache.hadoop.hdfs.server.federation.statestore.model.MembershipStateEntity.inflate(long) At MembershipStateEntity.java:[line 220]
            Invocation of toString on components in org.apache.hadoop.hdfs.server.federation.statestore.model.MountTableEntity.getLocations(String) At MountTableEntity.java:[line 157]
            Invocation of toString on subComponents in org.apache.hadoop.hdfs.server.federation.statestore.model.MountTableEntity.getLocations(String) At MountTableEntity.java:[line 167]
            new org.apache.hadoop.hdfs.server.federation.statestore.model.MountTableEntity(String, Map) makes inefficient use of keySet iterator instead of entrySet iterator At MountTableEntity.java:[line 80]
          Failed junit tests hadoop.hdfs.server.federation.statestore.TestMountTable
            hadoop.tools.TestHdfsConfigFields
            hadoop.hdfs.server.namenode.TestEditLog
            hadoop.hdfs.TestAsyncHDFSWithHA
          Timed out junit tests org.apache.hadoop.hdfs.server.federation.router.TestRouterSafemode
            org.apache.hadoop.hdfs.server.federation.router.TestRouterJmx
            org.apache.hadoop.hdfs.server.federation.statestore.TestStateStore
            org.apache.hadoop.hdfs.server.federation.router.TestRouterAdmin
            org.apache.hadoop.hdfs.server.federation.statestore.TestNamenodeRegistration
            org.apache.hadoop.hdfs.server.federation.router.TestRouterRpc



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:2c91fd8
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12808728/HDFS-10467.PoC.patch
          JIRA Issue HDFS-10467
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml shellcheck shelldocs cc
          uname Linux aef25500fe88 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / be34e85
          Default Java 1.8.0_91
          shellcheck v0.4.4
          findbugs v3.0.0
          javac https://builds.apache.org/job/PreCommit-HDFS-Build/15692/artifact/patchprocess/diff-compile-javac-root.txt
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/15692/artifact/patchprocess/diff-checkstyle-root.txt
          shellcheck https://builds.apache.org/job/PreCommit-HDFS-Build/15692/artifact/patchprocess/diff-patch-shellcheck.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/15692/artifact/patchprocess/whitespace-eol.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/15692/artifact/patchprocess/whitespace-tabs.txt
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/15692/artifact/patchprocess/new-findbugs-hadoop-hdfs-project_hadoop-hdfs.html
          javadoc https://builds.apache.org/job/PreCommit-HDFS-Build/15692/artifact/patchprocess/diff-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/15692/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/15692/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15692/testReport/
          asflicense https://builds.apache.org/job/PreCommit-HDFS-Build/15692/artifact/patchprocess/patch-asflicense-problems.txt
          modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs U: .
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15692/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.
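
          Several of the FindBugs warnings above follow a few recurring patterns; the sketch below shows the typical remedies (illustrative only, not the actual fixes later applied to the patch):

            import java.io.File;
            import java.io.IOException;
            import java.io.InputStream;
            import java.io.InputStreamReader;
            import java.io.Reader;
            import java.nio.charset.StandardCharsets;

            // Typical remedies for the warning patterns flagged above.
            public class FindbugsRemediesSketch {
              void examples(InputStream in, File dir) throws IOException {
                // "invokes inefficient Boolean/Integer/Long constructor": use valueOf.
                Boolean b = Boolean.valueOf("true"); // instead of new Boolean("true")
                Long l = Long.valueOf(42L);          // instead of new Long(42L)

                // "Found reliance on default encoding": pass the charset explicitly;
                // "may fail to close stream": use try-with-resources.
                try (Reader r = new InputStreamReader(in, StandardCharsets.UTF_8)) {
                  r.read();
                }

                // "Exceptional return value of java.io.File.mkdirs() ignored":
                if (!dir.mkdirs() && !dir.isDirectory()) {
                  throw new IOException("Could not create " + dir);
                }
              }
            }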

          zhz Zhe Zhang added a comment -

          Thanks. Impressive that there are only 12 deletions.

          elgoiri Íñigo Goiri added a comment -

          And we aimed to minimize the impact on those 12 classes.
          Actually, based on feedback, we expect to further reduce the impact on the Client and the Server.
          Right now, we are using those extensions to allow more connections between the Router and the NameNode.

          zhz Zhe Zhang added a comment -

          I read more about the design and patch; it looks really interesting. Great work here, Íñigo Goiri. Below are some questions and comments:

          1. Have you considered, and compared with, the option where the client first checks with the Router to get the NN address before doing the actual RPCs? Or where the client directly checks the mount table in the State Store to get the NN address?
          2. Have you considered using hard-linking on DNs for rebalancing?
          3. Just to clarify, in the posted design and patch, is the Subcluster Rebalancer a tool that must always be manually started? Or is some form of automatic rebalancing in scope? In the patch, which component or class contains the Rebalancer logic? The Rebalancer interface doesn't look like it.
          4. "We may also find (4) scenarios where too much load or high space requirements in a subcluster start to interfere with the primary tenants of the subcluster."

            What are "primary tenants" in this context? Non-Hadoop workloads running on the same physical nodes?

          5. Locking the mount entry during rebalancing sounds too disruptive to applications. Alternatively, could we abort the rebalancing when there is an incoming write? Coupled with the 5.2.1 Precondition, the chances of an aborted rebalancing shouldn't be too high.
          6. Locking a mount point is a little tricky. Technically, an HDFS client has full control over its local config and can be configured to talk directly to the NN of a subtree. In a production environment, this could be a legacy config file or a temporary workaround to bypass the Router. This could lead to data corruption. I'm not sure if we should consider adding subtree locking in HDFS (any previous discussions / JIRAs)?
          7. About the rebalancing protocol in 5.6:
            • In step 1, can we simplify it by limiting to at most 1 rebalancing effort at any given time? Then a new rebalancing effort would have to wait for the current rebalancing to either finish or be aborted.
            • Steps 7 and 9 involve waiting for all Routers to acknowledge some state change. Is this too heavyweight? Who maintains Router memberships? Are we doing this because the Routers cache the mount table data?
          elgoiri Íñigo Goiri added a comment -

          Zhe Zhang, thanks for the feedback. Some clarifications to your comments:

          1. We considered modifying ViewFs to check a remote, centralized mount table (I think this option is pretty much what you propose, right?). We didn't go this route for a couple of reasons: (1) it requires modifications to the client, and (2) rebalancing becomes challenging. In addition, with our approach we get some side advantages like a unified view of the federation, a Router that isolates the NameNodes from the clients, and better HA management.
          2. We haven't gone into hard-linking of DNs but that could be an improvement to the DistCp approach. We are open to improvements there but it might imply changes to the NNs.
          3. Our current implementation of the Subcluster Rebalancer is a tool similar to the regular Rebalancer and is also manually triggered. Right now, it's in a separate package; I can post a patch just for it. Our ultimate goal is to have a service that monitors the subclusters and triggers the proper Subcluster Rebalancer operation (this is work in progress).
          4. In our environment, we co-locate with other services (related to YARN-5215). I think this is orthogonal to the rebalancing but we can always go into that.
          5. The rebalancing itself is the most open part at this point. We've been targeting a tool that supports as many options as possible and lets the admin decide. For now, we support both locking and not locking.
          6. At some point we considered NN level locking. Actually, Gera Shegalov had a couple proposals for this based on permissions. We can refine this over time and maybe even implement locking at NN level.
          7. Regarding the rebalancing protocol, as I said, we are targeting to make it as broad as possible and let the admin pick their options.
          • I think it'd be better to support rebalancing of different subtrees at the same time. Only rebalancing within a subtree that is under rebalancing would be disallowed. We can always add options for that.
          • Again, this is an option we added based on internal feedback; the Subcluster Rebalancer has an option to wait or not. The Router membership is kept in the State Store and maintained by the Router; this is already in the PoC patch. And yes, the main reason to do this is the caching of the mount table (a resolution sketch follows below). Having the Router membership is also useful from an administration point of view to see the whole status of the federation.

          In general, I think we should start a separate effort for the Subcluster Rebalancer, as it has many design choices that can change. Obviously, we also need to transform this into an umbrella; right now it is too big. If people are positive about this effort, we should start discussing ways to split it.
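
          As a reference for the mount table caching discussed above, here is a minimal sketch of the longest-prefix-match resolution a Router performs for each path; the entries and subcluster names are invented for the example:

            import java.util.TreeMap;

            // Minimal sketch of the longest-prefix-match mount table lookup a
            // Router performs for every path; entries are made up.
            public class MountTableSketch {
              // Mount point -> subcluster namespace. A TreeMap keeps entries
              // sorted; a real implementation could walk prefixes efficiently.
              private final TreeMap<String, String> mounts = new TreeMap<>();

              public MountTableSketch() {
                mounts.put("/", "subcluster0");
                mounts.put("/data", "subcluster1");
                mounts.put("/data/logs", "subcluster2");
              }

              /** Return the subcluster owning the longest mount-point prefix of path. */
              public String resolve(String path) {
                String best = "/";
                for (String mount : mounts.keySet()) {
                  boolean matches = path.equals(mount)
                      || path.startsWith(mount + "/") || mount.equals("/");
                  if (matches && mount.length() > best.length()) {
                    best = mount;
                  }
                }
                return mounts.get(best);
              }

              public static void main(String[] args) {
                MountTableSketch t = new MountTableSketch();
                System.out.println(t.resolve("/data/logs/2016")); // subcluster2
                System.out.println(t.resolve("/data/raw"));       // subcluster1
                System.out.println(t.resolve("/user/alice"));     // subcluster0
              }
            }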

          elgoiri Íñigo Goiri added a comment -

          Refactored state store and added more unit tests.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 25s Docker mode activated.
          0 shelldocs 0m 1s Shelldocs was not available.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 15 new or modified test files.
          0 mvndep 0m 41s Maven dependency ordering for branch
          +1 mvninstall 6m 43s trunk passed
          +1 compile 7m 14s trunk passed
          +1 checkstyle 1m 45s trunk passed
          +1 mvnsite 1m 56s trunk passed
          +1 mvneclipse 0m 31s trunk passed
          +1 findbugs 3m 8s trunk passed
          +1 javadoc 1m 45s trunk passed
          0 mvndep 0m 12s Maven dependency ordering for patch
          +1 mvninstall 1m 29s the patch passed
          +1 compile 6m 49s the patch passed
          +1 cc 6m 49s the patch passed
          -1 javac 6m 49s root generated 3 new + 706 unchanged - 2 fixed = 709 total (was 708)
          -0 checkstyle 1m 48s root: The patch generated 478 new + 1183 unchanged - 5 fixed = 1661 total (was 1188)
          +1 mvnsite 2m 1s the patch passed
          +1 mvneclipse 0m 31s the patch passed
          -1 shellcheck 0m 12s The patch generated 4 new + 74 unchanged - 1 fixed = 78 total (was 75)
          -1 whitespace 0m 1s The patch has 36 line(s) with tabs.
          +1 xml 0m 2s The patch has no ill-formed XML file.
          -1 findbugs 2m 7s hadoop-hdfs-project/hadoop-hdfs generated 37 new + 0 unchanged - 0 fixed = 37 total (was 0)
          -1 javadoc 1m 1s hadoop-hdfs-project_hadoop-hdfs generated 4 new + 7 unchanged - 0 fixed = 11 total (was 7)
          +1 unit 8m 48s hadoop-common in the patch passed.
          -1 unit 60m 50s hadoop-hdfs in the patch failed.
          -1 asflicense 0m 28s The patch generated 11 ASF License warnings.
          114m 18s



          Reason Tests
          FindBugs module:hadoop-hdfs-project/hadoop-hdfs
            org.apache.hadoop.hdfs.server.federation.locator.PathTreeNode.toString(int) concatenates strings using + in a loop At PathTreeNode.java:in a loop At PathTreeNode.java:[line 130]
            Unread field:StateStoreMetrics.java:[line 56]
            Synchronization performed on java.util.concurrent.CopyOnWriteArrayList in org.apache.hadoop.hdfs.server.federation.router.ConnectionManager$1.run() At ConnectionManager.java:org.apache.hadoop.hdfs.server.federation.router.ConnectionManager$1.run() At ConnectionManager.java:[line 138]
            Synchronization performed on java.util.concurrent.CopyOnWriteArrayList in org.apache.hadoop.hdfs.server.federation.router.ConnectionManager$CleanupTask.run() At ConnectionManager.java:org.apache.hadoop.hdfs.server.federation.router.ConnectionManager$CleanupTask.run() At ConnectionManager.java:[line 237]
            Should org.apache.hadoop.hdfs.server.federation.router.NamenodeHearbeatService$NamenodeStatusReport be a static inner class? At NamenodeHearbeatService.java:inner class? At NamenodeHearbeatService.java:[lines 81-131]
            Unread public/protected field:At NamenodeHearbeatService.java:[line 85]
            org.apache.hadoop.hdfs.server.federation.router.Router.initAndStartRouter(Configuration, boolean) invokes System.exit(...), which shuts down the entire virtual machine At Router.java:shuts down the entire virtual machine At Router.java:[line 578]
            Exception is caught when Exception is not thrown in org.apache.hadoop.hdfs.server.federation.router.Router.getConnection(MembershipStateEntity) At Router.java:is not thrown in org.apache.hadoop.hdfs.server.federation.router.Router.getConnection(MembershipStateEntity) At Router.java:[line 963]
            org.apache.hadoop.hdfs.server.federation.router.Router$CustomSocketFactory doesn't override org.apache.hadoop.net.StandardSocketFactory.equals(Object) At Router.java:At Router.java:[line 1]
            Should org.apache.hadoop.hdfs.server.federation.router.Router$CustomSocketFactory be a static inner class? At Router.java:inner class? At Router.java:[lines 366-374]
            Redundant nullcheck of connection, which is known to be non-null in org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer$DefaultLookup.getClient() Redundant null check at RouterRpcServer.java:is known to be non-null in org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer$DefaultLookup.getClient() Redundant null check at RouterRpcServer.java:[line 950]
            org.apache.hadoop.hdfs.server.federation.statestore.StateStore.overrideRegistration(String, String, MembershipStateEntity$FederationNamenodeServiceState) does not release lock on all exception paths At StateStore.java:release lock on all exception paths At StateStore.java:[line 784]
            org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreDriver.deserializeObject(Object, Class) invokes inefficient Boolean constructor; use Boolean.valueOf(...) instead At StateStoreDriver.java:constructor; use Boolean.valueOf(...) instead At StateStoreDriver.java:[line 487]
            Primitive boxed just to call toString in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreDriver.serializeObject(Object, Class) At StateStoreDriver.java:toString in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreDriver.serializeObject(Object, Class) At StateStoreDriver.java:[line 453]
            org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreDriver.deserializeObject(Object, Class) invokes inefficient new Integer(String) constructor; use Integer.valueOf(String) instead At StateStoreDriver.java:Integer(String) constructor; use Integer.valueOf(String) instead At StateStoreDriver.java:[line 485]
            org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreDriver.deserializeObject(Object, Class) invokes inefficient new Long(String) constructor; use Long.valueOf(String) instead At StateStoreDriver.java:Long(String) constructor; use Long.valueOf(String) instead At StateStoreDriver.java:[line 483]
            org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreDriver.serializeObject(Object, Class) invokes inefficient new Long(long) constructor; use Long.valueOf(long) instead At StateStoreDriver.java:Long(long) constructor; use Long.valueOf(long) instead At StateStoreDriver.java:[line 453]
            Redundant nullcheck of results, which is known to be non-null in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreDriver.getSingle(Class, Map) Redundant null check at StateStoreDriver.java:is known to be non-null in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreDriver.getSingle(Class, Map) Redundant null check at StateStoreDriver.java:[line 288]
            org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreDriver.filterMultiple(Map, Iterable) makes inefficient use of keySet iterator instead of entrySet iterator At StateStoreDriver.java:of keySet iterator instead of entrySet iterator At StateStoreDriver.java:[line 206]
            Found reliance on default encoding in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreFile.getReader(Class):in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreFile.getReader(Class): new java.io.FileReader(File) At StateStoreFile.java:[line 103]
            Found reliance on default encoding in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreFile.getWriter(Class):in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreFile.getWriter(Class): new java.io.FileWriter(File, boolean) At StateStoreFile.java:[line 119]
            Found reliance on default encoding in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreFileSystem.getReader(Class):in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreFileSystem.getReader(Class): new java.io.InputStreamReader(InputStream) At StateStoreFileSystem.java:[line 132]
            Found reliance on default encoding in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreFileSystem.getWriter(Class):in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreFileSystem.getWriter(Class): new java.io.OutputStreamWriter(OutputStream) At StateStoreFileSystem.java:[line 149]
            Found reliance on default encoding in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreZooKeeper.writeAll(Collection, Class):in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreZooKeeper.writeAll(Collection, Class): String.getBytes() At StateStoreZooKeeper.java:[line 380]
            Inconsistent synchronization of org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreZooKeeper.zkAuths; locked 50% of time Unsynchronized access at StateStoreZooKeeper.java:50% of time Unsynchronized access at StateStoreZooKeeper.java:[line 96]
            Inconsistent synchronization of org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreZooKeeper.zkHost; locked 50% of time Unsynchronized access at StateStoreZooKeeper.java:50% of time Unsynchronized access at StateStoreZooKeeper.java:[line 92]
            Redundant nullcheck of znode, which is known to be non-null in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreZooKeeper.deleteAll(Class) Redundant null check at StateStoreZooKeeper.java:is known to be non-null in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreZooKeeper.deleteAll(Class) Redundant null check at StateStoreZooKeeper.java:[line 451]
            Redundant nullcheck of znode, which is known to be non-null in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreZooKeeper.getAll(Class) Redundant null check at StateStoreZooKeeper.java:is known to be non-null in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreZooKeeper.getAll(Class) Redundant null check at StateStoreZooKeeper.java:[line 333]
            Found reliance on default encoding in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreZooKeeper$ZKData.getData():in org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreZooKeeper$ZKData.getData(): new String(byte[]) At StateStoreZooKeeper.java:[line 261]
            Should org.apache.hadoop.hdfs.server.federation.statestore.impl.StateStoreZooKeeper$ZKData be a static inner class? At StateStoreZooKeeper.java:inner class? At StateStoreZooKeeper.java:[lines 256-261]
            Invocation of toString on components in org.apache.hadoop.hdfs.server.federation.statestore.model.BaseEntity.createEntity(String, StateStoreDriver, Class, boolean) At BaseEntity.java:in org.apache.hadoop.hdfs.server.federation.statestore.model.BaseEntity.createEntity(String, StateStoreDriver, Class, boolean) At BaseEntity.java:[line 257]
            Exception is caught when Exception is not thrown in org.apache.hadoop.hdfs.server.federation.statestore.model.BaseEntity.serialize(StateStoreDriver, boolean) At BaseEntity.java:is not thrown in org.apache.hadoop.hdfs.server.federation.statestore.model.BaseEntity.serialize(StateStoreDriver, boolean) At BaseEntity.java:[line 311]
            org.apache.hadoop.hdfs.server.federation.statestore.model.BaseEntity.generateMashupKey(Map) makes inefficient use of keySet iterator instead of entrySet iterator At BaseEntity.java:keySet iterator instead of entrySet iterator At BaseEntity.java:[line 180]
            Nullcheck of BaseEntity.dateModified at line 215 of value previously dereferenced in org.apache.hadoop.hdfs.server.federation.statestore.model.MembershipStateEntity.inflate(long) At MembershipStateEntity.java:215 of value previously dereferenced in org.apache.hadoop.hdfs.server.federation.statestore.model.MembershipStateEntity.inflate(long) At MembershipStateEntity.java:[line 213]
            Invocation of toString on components in org.apache.hadoop.hdfs.server.federation.statestore.model.MountTableEntity.getLocations(String) At MountTableEntity.java:in org.apache.hadoop.hdfs.server.federation.statestore.model.MountTableEntity.getLocations(String) At MountTableEntity.java:[line 149]
            Invocation of toString on subComponents in org.apache.hadoop.hdfs.server.federation.statestore.model.MountTableEntity.getLocations(String) At MountTableEntity.java:in org.apache.hadoop.hdfs.server.federation.statestore.model.MountTableEntity.getLocations(String) At MountTableEntity.java:[line 159]
            new org.apache.hadoop.hdfs.server.federation.statestore.model.MountTableEntity(String, Map) makes inefficient use of keySet iterator instead of entrySet iterator At MountTableEntity.java:use of keySet iterator instead of entrySet iterator At MountTableEntity.java:[line 74]
          Failed junit tests hadoop.hdfs.server.federation.router.TestRouterRpc
            hadoop.hdfs.server.federation.statestore.TestStateStore
            hadoop.hdfs.server.federation.statestore.impl.TestStateStoreZK
            hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
            hadoop.hdfs.server.federation.router.TestRouterJmx
            hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery
            hadoop.tools.TestHdfsConfigFields
            hadoop.hdfs.server.federation.statestore.TestNamenodeRegistration
            hadoop.hdfs.server.federation.router.TestRouterAdmin
            hadoop.hdfs.server.federation.router.TestRouterSafemode
            hadoop.hdfs.server.federation.statestore.impl.TestStateStoreFileSystem



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:85209cc
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12815804/HDFS-10467.PoC.001.patch
          JIRA Issue HDFS-10467
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml shellcheck shelldocs cc
          uname Linux e6d858bd83ea 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 4009fa3
          Default Java 1.8.0_91
          shellcheck v0.4.4
          findbugs v3.0.0
          javac https://builds.apache.org/job/PreCommit-HDFS-Build/15964/artifact/patchprocess/diff-compile-javac-root.txt
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/15964/artifact/patchprocess/diff-checkstyle-root.txt
          shellcheck https://builds.apache.org/job/PreCommit-HDFS-Build/15964/artifact/patchprocess/diff-patch-shellcheck.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/15964/artifact/patchprocess/whitespace-tabs.txt
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/15964/artifact/patchprocess/new-findbugs-hadoop-hdfs-project_hadoop-hdfs.html
          javadoc https://builds.apache.org/job/PreCommit-HDFS-Build/15964/artifact/patchprocess/diff-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/15964/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15964/testReport/
          asflicense https://builds.apache.org/job/PreCommit-HDFS-Build/15964/artifact/patchprocess/patch-asflicense-problems.txt
          modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs U: .
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15964/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          elgoiri Íñigo Goiri added a comment -

          This optimization significantly increases the performance of the Router.

          elgoiri Íñigo Goiri added a comment -

          After checking the code, I think there might be a number of overlaps between this work and YARN-2915. I'd like to explore what we could move into Hadoop Common to manage a federated space. I will probably open a new JIRA for that.

          In addition, given the feedback collected during the last few weeks, it seems like the community is OK with going in this direction, so I'd like to start moving the review process forward.
          To simplify the review, I propose converting this JIRA into an umbrella and splitting the current patch into smaller subtasks. For now, I would like to start with:

          1. Minimum Router
          2. State Store interface (a rough sketch follows below)
          3. ZooKeeper State Store implementation

          We can add more tasks if people think that is the way to go. Probably, it's a good idea to create a new branch for this effort. Thoughts? Opinions?
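
          To make the State Store interface item concrete, here is a rough sketch of what a minimal driver interface could look like, based on the method names visible in the QA report above (illustrative only; the actual patch may differ):

          import java.io.IOException;
          import java.util.Collection;
          import java.util.List;
          import org.apache.hadoop.conf.Configuration;

          /** Illustrative sketch of a minimal State Store driver interface. */
          public interface StateStoreDriverSketch {
            /** Initialize the connection to the backend (file, HDFS, ZooKeeper, ...). */
            boolean init(Configuration conf);
            /** Write all records of a given type, e.g., mount table entries. */
            <T> boolean writeAll(Collection<T> records, Class<T> clazz) throws IOException;
            /** Read all records of a given type, e.g., Namenode registrations. */
            <T> List<T> getAll(Class<T> clazz) throws IOException;
            /** Remove all records of a given type. */
            <T> boolean deleteAll(Class<T> clazz) throws IOException;
          }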

          jingzhao Jing Zhao added a comment -

          Assigned the JIRA to Íñigo Goiri.

          Probably, it's a good idea to create a new branch for this effort.

          +1. I've created the feature branch HDFS-10467. Please feel free to use it for the next steps of development.

          mingma Ming Ma added a comment -

          Íñigo Goiri, nice work! Here are a couple more questions about the design. I will post code review questions in the sub-JIRAs.

          • Support for mergeFs HADOOP-8298. We should be able to extend the design to support this. There might be some issues around how to provision a new subfolder (which namespace should own it) and how it works with the rebalancer. This could be a good addition to the future work section.
          • Handling of inconsistent state. Given that routers cache which namenodes are active, this state could differ from the actual namenode state at that moment. Thus routers might get a StandbyException and need to retry on another namenode. If so, does it mean the routers should leverage the ipc FailoverOnNetworkExceptionRetry or use a DFSClient with a hint for the active namenode?
          • Soft state vs hard state. While the subcluster's active namenode and load/space are soft state that can be reconstructed from the namenodes, the mount table is hard state that needs to be persisted. Is there any benefit in separating them out into different state stores, as they have different persistence requirements, access patterns (the mount table doesn't change much while load/space updates are frequent), and admin interfaces? For example, an admin might want to update the mount table on demand, but not the load/space state.
          • Usage of subcluster load/space state. Is it correct that the only consumer of the subcluster's load/space state is the rebalancer? I imagine initially we would run the rebalancer manually. For that, the rebalancer can just pull the subcluster's load/space state from the namenodes on demand. Then we don't have to store the subcluster load/space state in the state store.
          • Admin's modification of the mount table. Besides the rebalancer, an admin might want to update the mount table during initial cluster setup, as well as when adding a new namespace with a new mount entry. If we continue to use mounttable.xml, then admins can push the update the same way as in the viewFs setup. If we use the ZK store, then we need to provide tools to update the state store.
          • What is the performance optimization in your latest patch? Is it based on the async RPC client?
          vinayrpet Vinayakumar B added a comment -

          Hi,
          This will be a nice addition to Federation.

          As it happens, I was also working on a similar feature with almost the same design.

          The design looks great; I have some comments though.

          3.3.2 Namenode heartbeat HA
          For high availability and flexibility, multiple Routers can monitor the same Namenode and heartbeat the information to the State Store

          If a Router tries to contact the active Namenode but is unable to do it, the Router will try the other
          Namenodes in the subcluster.

          Why can't we use the DFSClient with HA here directly? Just like current clients connect to an HA pair, the Router could connect to each subcluster using an HA-configured DFSClient. The DFSClient itself will handle switching between NNs in case of failover. This DFSClient can be kept as-is and reused later for subsequent requests on the same subcluster, so it will know the current active namenode.
          By doing this, there would be no need for a heartbeat between the Router and the NameNode to monitor the NameNode status. (An example of such a client-side HA configuration is sketched below.)
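
          For reference, a Router could build such a per-subcluster client from the standard client-side HA configuration, along these lines (the nameservice ID and hostnames are made-up examples):

          import org.apache.hadoop.conf.Configuration;

          public class HaClientConfExample {
            public static Configuration create() {
              Configuration conf = new Configuration();
              conf.set("dfs.nameservices", "ns1");
              conf.set("dfs.ha.namenodes.ns1", "nn1,nn2");
              conf.set("dfs.namenode.rpc-address.ns1.nn1", "host1:8020");
              conf.set("dfs.namenode.rpc-address.ns1.nn2", "host2:8020");
              conf.set("dfs.client.failover.proxy.provider.ns1",
                  "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
              // A client created against hdfs://ns1 then fails over between nn1 and nn2.
              return conf;
            }
          }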

          MountTable

          How about supporting a default mount point for '/' as well? It could also be optional, instead of rejecting requests for paths which don't match any other mount point.
          There might be some use cases with multiple first-level directories other than the mounted paths, which could go under '/'.
          Similar to Linux's filesystem mounts.

          I will try to review the code in the sub-JIRAs.

          elgoiri Íñigo Goiri added a comment -

          Ming Ma, thank you for the comments. A few answers/clarifications.

          Support for mergeFs HADOOP-8298. We should be able to extend the design to support this. There might be some issues around how to provision a new subfolder (which namespace should own it) and how it works with the rebalancer. This could be a good addition to the future work section.

          In the prototype we actually started this, but we haven't gone into testing it. In addition, I think merge points go a little bit in the direction of N-Fly in HADOOP-12077. I think we should support both of them together. I'll add the reference explicitly to the document.

          Handling of inconsistent state. Given that routers cache which namenodes are active, this state could differ from the actual namenode state at that moment. Thus routers might get a StandbyException and need to retry on another namenode. If so, does it mean the routers should leverage the ipc FailoverOnNetworkExceptionRetry or use a DFSClient with a hint for the active namenode?

          In the current implementation we use the client with the hint: we first try the Namenode marked as active in the State Store and capture StandbyExceptions, etc. This is in HDFS-10629 in RouterRpcServer#invokeMethod(); a simplified sketch of that logic follows below.
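
          A simplified sketch of that logic (NamenodeContext and getProxy are hypothetical helper names for this illustration; the real code is in the HDFS-10629 patch):

          // Namenodes are ordered with the one marked ACTIVE in the State Store first.
          Object invokeWithActiveHint(List<NamenodeContext> namenodes,
              Method method, Object... params) throws IOException {
            for (NamenodeContext namenode : namenodes) {
              try {
                ClientProtocol proxy = getProxy(namenode); // cached RPC connection
                return method.invoke(proxy, params);
              } catch (InvocationTargetException e) {
                if (e.getCause() instanceof StandbyException) {
                  continue; // stale hint: this Namenode is standby now, try the next one
                }
                throw new IOException(e.getCause());
              } catch (IllegalAccessException e) {
                throw new IOException(e);
              }
            }
            throw new IOException("No active Namenode found for this subcluster");
          }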

          Soft state vs hard state. While the subcluster's active namenode and load/space are soft state that can be reconstructed from the namenodes, the mount table is hard state that needs to be persisted. Is there any benefit in separating them out into different state stores, as they have different persistence requirements, access patterns (the mount table doesn't change much while load/space updates are frequent), and admin interfaces? For example, an admin might want to update the mount table on demand, but not the load/space state.

          True, this is easy to implement right now. We should see if people are OK with the additional complexity of configuring two backends. I guess we can discuss it in HDFS-10630.

          Usage of subcluster load/space state. Is it correct that the only consumer of the subcluster's load/space state is the rebalancer? I imagine initially we would run the rebalancer manually. For that, the rebalancer can just pull the subcluster's load/space state from the namenodes on demand. Then we don't have to store the subcluster load/space state in the state store.

          Correct. Right now we are not even storing load/space data in the State Store. Actually, in our Rebalancer prototypes we are collecting the space information externally. For now, we will keep the usage state out of the State Store, and once we get to the Rebalancer we can discuss what's best.

          Admin's modification of the mount table. Besides the rebalancer, an admin might want to update the mount table during initial cluster setup, as well as when adding a new namespace with a new mount entry. If we continue to use mounttable.xml, then admins can push the update the same way as in the viewFs setup. If we use the ZK store, then we need to provide tools to update the state store.

          Right now, our admin tool goes through the Routers to modify the mount table. We could also go directly to the State Store. I just created HDFS-10646 to develop this.

          What is the performance optimization in your latest patch? Is it based on the async RPC client?

          Our current optimization is based on being able to use more sockets: the current client has a single thread pool per connection and we were limited by this. We haven't explored async extensively and we are not yet sure it would give us the performance we need; we still need to explore this. A rough sketch of the multiple-connections idea is below.
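
          A rough sketch of the idea (all names here are hypothetical, not the actual patch): keep several independent RPC connections per (user, namenode) pair and rotate over them, instead of funneling everything through a single connection.

          import java.util.List;
          import java.util.Map;
          import java.util.concurrent.ConcurrentHashMap;
          import java.util.concurrent.atomic.AtomicInteger;
          import org.apache.hadoop.hdfs.protocol.ClientProtocol;

          public class ConnectionPoolSketch {
            private static final int POOL_SIZE = 8; // assumed; would be configurable
            private final Map<String, List<ClientProtocol>> pools = new ConcurrentHashMap<>();
            private final AtomicInteger counter = new AtomicInteger();

            public ClientProtocol getConnection(String user, String namenodeAddress) {
              String key = user + "@" + namenodeAddress;
              List<ClientProtocol> pool = pools.computeIfAbsent(
                  key, k -> createConnections(user, namenodeAddress, POOL_SIZE));
              // Round-robin over the pool so no single socket becomes the bottleneck.
              return pool.get(Math.floorMod(counter.getAndIncrement(), pool.size()));
            }

            private List<ClientProtocol> createConnections(String user, String address, int n) {
              // Would create n RPC proxies, each with its own socket; omitted here.
              throw new UnsupportedOperationException("illustration only");
            }
          }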

          I'll update the document accordingly.

          elgoiri Íñigo Goiri added a comment -

          Thanks Vinayakumar B for the comments.

          Why can't we use the DFSClient with HA here directly? Just like current clients connect to an HA pair, the Router could connect to each subcluster using an HA-configured DFSClient. The DFSClient itself will handle switching between NNs in case of failover. This DFSClient can be kept as-is and reused later for subsequent requests on the same subcluster, so it will know the current active namenode.

          To provide a fully federated view, we think it is best to track the state of all the Namenodes. In this way, we can expose the federation view in the web UI. Given that we have this information, we can use it as a hint for the clients. Actually, there was some discussion in HDFS-7858 about using the information in ZK to go faster to the Active namenode; this was discarded because of the additional complexity. I think this might be a good opportunity to go in that direction. Our current implementation (using the Active hint) is faster than the regular failover and produces less load than the hedging approach.

          How about supporting a default mount point for '/' as well? It could also be optional, instead of rejecting requests for paths which don't match any other mount point.

          There might be some use cases with multiple first-level directories other than the mounted paths, which could go under '/'.

          Yes, this is a common use case. We already support a default '/' set using dfs.router.default.nameserviceId; an example follows below. We may want to make it more explicit/clear.
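
          For example (the property name comes from the comment above; the value ns0 is made up):

          Configuration conf = new Configuration();
          // Requests for paths with no mount table match would go to ns0,
          // similar to '/' in Linux filesystem mounts.
          conf.set("dfs.router.default.nameserviceId", "ns0");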

          elgoiri Íñigo Goiri added a comment -

          Regarding the rebalancing operations, currently we are proposing to disallow write accesses through the Routers. The problem is that we then also have to disallow direct accesses to the Namenodes to prevent writes at that level. For this reason, we could leverage the concept of immutable folders/files from HDFS-3154 and, more recently, HDFS-7568. Not sure how likely those efforts are to move forward though.

          subru Subru Krishnan added a comment - - edited

          Thanks Jason Kace and Inigo for the refactored patch. I made a quick pass in the context of HADOOP-13378 and I think that we can represent the YARN FederationStateStore using the generic StateStoreDriver you guys have defined. Personally, I prefer the push mechanism we have in YARN-3671, as it's much simpler than the pull mechanism proposed here, though I do agree both achieve the same result.

          A couple of comments based on my quick scan:

          • We should add versioning to the generic StateStoreDriver. Refer to FederationStateStore.
          • Use jcache instead of writing custom key-caches based on ConcurrentHashMaps. In fact, I feel we can refactor the (jcache-based) cache in FederationStateStoreFacade and use it across both efforts.
          • We should reuse the RecordFactory and supporting infrastructure from YARN as opposed to coming up with a parallel structure in HDFS.
          • Lastly, I would suggest using Curator for the ZooKeeper implementation, as we have moved to it in YARN (YARN-4438 and follow-up work); see the sketch below.
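
          To illustrate the last point, a minimal Curator-based State Store write could look like the following (the znode paths and record layout are assumptions, not the actual implementation):

          import java.nio.charset.StandardCharsets;
          import org.apache.curator.framework.CuratorFramework;
          import org.apache.curator.framework.CuratorFrameworkFactory;
          import org.apache.curator.retry.ExponentialBackoffRetry;

          public class ZkStateStoreSketch {
            public static void writeRecord() throws Exception {
              CuratorFramework zk = CuratorFrameworkFactory.newClient(
                  "zk1:2181,zk2:2181,zk3:2181", new ExponentialBackoffRetry(1000, 3));
              zk.start();
              // One serialized record per znode, grouped by record type.
              String path = "/hdfs-router/MountTable/entry-0001";
              byte[] data = "/data->ns1:/data".getBytes(StandardCharsets.UTF_8);
              if (zk.checkExists().forPath(path) == null) {
                zk.create().creatingParentsIfNeeded().forPath(path, data);
              } else {
                zk.setData().forPath(path, data);
              }
              zk.close();
            }
          }
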
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 0s Docker mode activated.
          -1 patch 0m 6s HDFS-10467 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help.



          Subsystem Report/Notes
          JIRA Issue HDFS-10467
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12815804/HDFS-10467.PoC.001.patch
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/16828/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          elgoiri Íñigo Goiri added a comment -

          Subru Krishnan, I agree on RecordFactory. I just created HADOOP-13642 to move this from YARN to Common.
          Thanks for the feedback!

          jakace Jason Kace added a comment -

          Subru Krishnan, thank you for the feedback!

          1) Using jcache for the query caches is a good idea. A TODO I have is to increase the scalability of the caches and/or to prune older entries; jcache seems to handle these well (a small JSR-107 sketch follows below). I'll check out YARN to see if there is a cache manager we can reuse. For the internal caches of state store records, I'm not convinced jcache provides any benefits, as these caches are closely synchronized with internal data structures such as the tree representation of the mount table, etc.
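
          A small JSR-107 (javax.cache) sketch of the query-cache idea, assuming a one-minute expiry (the cache name and key/value types are made up):

          import javax.cache.Cache;
          import javax.cache.CacheManager;
          import javax.cache.Caching;
          import javax.cache.configuration.MutableConfiguration;
          import javax.cache.expiry.CreatedExpiryPolicy;
          import javax.cache.expiry.Duration;

          public class QueryCacheSketch {
            public static void main(String[] args) {
              CacheManager manager = Caching.getCachingProvider().getCacheManager();
              MutableConfiguration<String, String> config =
                  new MutableConfiguration<String, String>()
                      .setExpiryPolicyFactory(CreatedExpiryPolicy.factoryOf(Duration.ONE_MINUTE))
                      .setStoreByValue(false);
              Cache<String, String> queryCache = manager.createCache("stateStoreQueries", config);
              queryCache.put("/user/foo", "ns1");              // cache a resolved mount lookup
              String subcluster = queryCache.get("/user/foo"); // null once the entry expires
            }
          }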

          2) I'll work on Curator for ZK. It will simplify the codebase and connection management.

          3) I'll add versioning. For HDFS federation, there are multiple APIs implemented in different classes; I recommend that each of these is versioned (i.e., Registration, MountTable, RouterState, Rebalancer, etc.). The driver interface is separate from the interface APIs and should also be versioned. Each of the data records and/or API request/response objects could potentially be versioned, but I think it is best to keep their version tied to the interface API, as each interface has a 1:many relationship with its objects.

          Show
          jakace Jason Kace added a comment - Subru Krishnan , than you for the feedback! 1) Using jcache for the query caches is a good idea. A TODO I have is to increase the scalability of the caches and/or to prune older entries. jcache seems to handle these well. I'll check out YARN to see if there is a cache manager we can reuse. For the internal caches of state store records, I'm not convinced jcache provides any benefits as these caches are closely synchronized with internal data structures such as the tree representation of the mount table, etc. 2) I'll work on curator for ZK. It will simplify the codebase and connection management. 3) I'll add versioning. For HDFS federation, there are multiple APIs implemented in different classes, I recommend that each of these are versioned (i.e. Registration, MountTable, RouterState, Rebalancer, etc). The driver interface is separate from the interface APIs and should also be versioned. Each of the data records and/or API request/response objects can potentially be versioned, but I think it is best to keep their version tied to the interface API as each has a 1:many relationship between the interface:object.
          elgoiri Íñigo Goiri added a comment -

I will create a fork on GitHub with the code we run in production so people can try it.
          What version should I use as base? trunk? 2.8? branch-2?

          Hide
          elgoiri Íñigo Goiri added a comment -
          Show
          elgoiri Íñigo Goiri added a comment - Created fork to branch-2.6.1: https://github.com/goiri/hadoop/tree/branch-2.6.1-hdfs-router
          hadoopqa Hadoop QA added a comment -
          -1 overall



Vote | Subsystem | Runtime | Comment
  0  | reexec    | 0m 0s   | Docker mode activated.
 -1  | patch     | 0m 11s  | HDFS-10467 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help.



Subsystem      | Report/Notes
JIRA Issue     | HDFS-10467
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12815804/HDFS-10467.PoC.001.patch
Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/17784/console
Powered by     | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -
          -1 overall



Vote | Subsystem | Runtime | Comment
  0  | reexec    | 0m 0s   | Docker mode activated.
 -1  | patch     | 0m 17s  | HDFS-10467 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help.



Subsystem      | Report/Notes
JIRA Issue     | HDFS-10467
JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12815804/HDFS-10467.PoC.001.patch
Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18772/console
Powered by     | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          fabbri Aaron Fabbri added a comment -

Thanks for the update, Íñigo Goiri. I'm trying to get a feel for the overall progress of this. Are there any work items that are not already covered in the subtasks here? Any other details on how much work is left, or when you expect to have the basic features completed, are welcome.

          elgoiri Íñigo Goiri added a comment -

Aaron Fabbri, the tasks in the current JIRA are the basic ones needed to get the Router-based federation working.
There are a number of additional features we can add:

          • Web interface
          • Metrics system
          • Router heartbeating
          • Router safe mode
          • Rebalancing

All of these are already implemented and running in our clusters.
A version from a couple of months ago is available at:
https://github.com/goiri/hadoop/tree/branch-2.6.1-hdfs-router
(I can update it with the latest if needed.)

At this point it is a matter of reviewing the code in the subtasks.
It's hard to give a time frame without reviews, so any reviews on the subtasks are highly appreciated.

          elgoiri Íñigo Goiri added a comment -

Attaching a patch with the status of HDFS-10467 as of August 31st, for the merge discussion.

          elgoiri Íñigo Goiri added a comment -

He Tianyi, now that most of the patches are ready and we are discussing what it would take to merge the branch into trunk, would you mind taking a look at the code?
Let me know if there is any feature from NNProxy that you think should be covered here.

          He Tianyi He Tianyi added a comment -

Íñigo Goiri, thanks for asking.
Our federated cluster has grown to 7000+ nodes this year. I can share some lessons learned in production with nnproxy:

• Data rebalancing between subclusters can be sped up with 'fastcopy' or a similar method, which effectively reduces resource consumption when there is intensive rebalancing work.
• Isolation between subclusters for request forwarding on a single router is probably required; otherwise an outage of any subcluster could also affect the others (from the client's point of view) due to shared resources, i.e. the thread pool (IPC handlers) and the client connection pool (IPC client). We did this by implementing a fully nonblocking version of the proxy, and we also use multiple client connections to forward requests (as HADOOP-13144 suggests). A sketch of per-subcluster isolation follows this list.
• Global quota: we've disabled quotas on each NameNode. Quota usage is computed by a separate service that reads/tails the fsimage and edit log from all subclusters, while nnproxy enforces the quota (for example, rejecting file creation when usage exceeds the limit).
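To make the isolation point above concrete, here is a purely illustrative Java sketch — the class name, method names, and pool size are all hypothetical, not code from nnproxy or from the Router:

{code:java}
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/**
 * Hypothetical sketch: give each subcluster its own handler pool so a
 * hung or slow namespace cannot exhaust the threads shared by the others.
 */
public class SubclusterIsolation {
  private final Map<String, ExecutorService> pools = new ConcurrentHashMap<>();

  private ExecutorService poolFor(String nameservice) {
    // 16 handlers per subcluster is a made-up number for illustration.
    return pools.computeIfAbsent(nameservice,
        ns -> Executors.newFixedThreadPool(16));
  }

  /** Forward an RPC to a subcluster using only that subcluster's threads. */
  public <T> Future<T> forward(String nameservice, Callable<T> rpc) {
    return poolFor(nameservice).submit(rpc);
  }
}
{code}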
          elgoiri Íñigo Goiri added a comment -

There are a couple of improvements for HDFS-12273 that should go into separate JIRAs.
As they are not required for the merge, I'm thinking of filing them outside this umbrella with the suffix RBF (for Router-Based Federation).

          elgoiri Íñigo Goiri added a comment -

The vote for merging finishes tomorrow. At this point, the only open item is the naming issue described in HDFS-12577. The security patch can be moved to v2 (I'll open a new umbrella with the remaining issues).

          elgoiri Íñigo Goiri added a comment -

          The vote passed so I merged HDFS-10467 into trunk and branch-3.0.

Andrew Wang, I compiled and tested both branches and they seem to work.
Let me know if there are any issues.
I may have forgotten to close the JIRA properly (target versions and so on), so feel free to clean that up if so.

With this, HDFS-10467 is complete with all sub-tasks closed, and we move to phase 2 in HDFS-12615.

          Thanks everybody for the comments!

          hudson Hudson added a comment -

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13045 (See https://builds.apache.org/job/Hadoop-trunk-Commit/13045/)

HDFS-12223. Rebasing HDFS-10467. Contributed by Inigo Goiri. (inigoiri: rev 0ec82b8cdfaaa5f23d1a0f7f7fb8c9187c5e309b)
• (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java

HDFS-12312. Rebasing HDFS-10467 (2). Contributed by Inigo Goiri. (inigoiri: rev 346c9fce43ebf6a90fc56e0dc7c403f97cc5391f)
• (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
• (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs

HDFS-12430. Rebasing HDFS-10467 after HDFS-12269 and HDFS-12218. (inigoiri: rev 1f06b81ecb14044964176dd16fafaa0ee96bfe3d)
• (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java

HDFS-12580. Rebasing HDFS-10467 after HDFS-12447. Contributed by Inigo (inigoiri: rev 6c69e23dcdf1cdbddd47bacdf2dace5c9f06e3ad)
• (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
          andrew.wang Andrew Wang added a comment -

          Thanks for working on this Inigo. Do you mind adding a release note to this JIRA? We should also update hadoop-project/src/site/markdown/index.md.vm with links to the docs.

          elgoiri Íñigo Goiri added a comment -

          Thanks Andrew Wang.
I updated the release notes for this JIRA and created HADOOP-14939 to update index.md.vm.
Not my finest piece of literature, so feel free to suggest improvements.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13061 (See https://builds.apache.org/job/Hadoop-trunk-Commit/13061/)
          HADOOP-14939. Update project release notes with HDFS-10467 for 3.0.0. (wang: rev 132cdac0ddb5c38205a96579a23b55689ea5a8e3)

          • (edit) hadoop-project/src/site/markdown/index.md.vm

People

• Assignee: Íñigo Goiri
• Reporter: Íñigo Goiri