HBase
  1. HBase
  2. HBASE-6435

Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.95.2
    • Fix Version/s: 0.95.0
    • Component/s: master, regionserver
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      This JIRA adds a hook in the HDFS client to reorder the replica locations for HLog files. The default ordering in HDFS is rack-aware + random. When reading an HLog file, we prefer not to use the replica on the same server as the region server that wrote the HLog: this server is likely to be unavailable, and that would delay the HBase recovery by one minute. This occurs because the recovery starts sooner in HBase than in HDFS: 3 minutes by default in HBase vs. 10:30 minutes in HDFS. This will be changed in HDFS-3703. Moreover, when an HDFS file is already open for writing, a read triggers another call to get the file size, leading to another timeout (see HDFS-3704), as well as a wrong file size value (see HDFS-3701 and HBASE-6401). Technically:
      - this hook won't be useful anymore once HDFS-3702, HDFS-3705, or HDFS-3706 is available and used in HBase.
      - the hook intercepts the calls to the namenode and reorders the locations it returns, extracting the region server name from the HLog file name. This server is put at the end of the list, ensuring it will be tried only if all the others fail.
      - It has been tested with HDFS 1.0.3 and HDFS 2.0 alpha.
      - It can be deactivated (at master & region server start-up) by setting "hbase.filesystem.reorder.blocks" to false in the HBase configuration.
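      As a quick illustration of that last point, the switch could be set like this in hbase-site.xml (a sketch based only on the configuration key named in the release note):

```xml
<!-- Disable the HDFS block-location reordering hook (it is on by default). -->
<property>
  <name>hbase.filesystem.reorder.blocks</name>
  <value>false</value>
</property>
```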
    • Tags:
      0.96notable

      Description

      HBase writes a Write-Ahead Log (WAL) to recover from hardware failure. This log is written on HDFS.
      Through ZooKeeper, HBase is usually informed within ~30 seconds that it should start the recovery process.
      This means reading the Write-Ahead Log to replay the edits on the other servers.

      In standard deployments, the HBase processes (regionservers) are deployed on the same boxes as the datanodes.

      It means that when the box stops, we've actually lost one of the replicas, as we lost both the regionserver and the datanode.

      As HDFS marks a node as dead only after ~10 minutes, the dead node still appears available when we try to read the blocks during recovery. As a result, we delay the recovery process by 60 seconds, as the read will usually fail with a socket timeout. If the file is still open for writing, this adds an extra 20s, plus a risk of losing edits if we connect via IPC to the dead DN.

      Possible solutions are:

      • shorter dead-datanode detection by the NN. Requires an NN code change.
      • better dead-datanode management in the DFSClient. Requires a DFS code change.
      • NN customisation to write the WAL files on another DN instead of the local one.
      • reordering the blocks returned by the NN on the client side, to put the blocks on the same DN as the dead RS at the end of the priority queue. Requires a DFS code change or some kind of workaround.
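      The last option can be sketched as follows. This is a hypothetical, standalone illustration (not the actual patch): given the datanode hosts returned by the NN for a block, any replica hosted on the dead region server's box is moved to the end of the list, so it is only tried after all the others fail.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockReorderSketch {
    // Return the locations with any replica on deadHost demoted to the end.
    public static List<String> deprioritize(List<String> locations, String deadHost) {
        List<String> preferred = new ArrayList<>();
        List<String> last = new ArrayList<>();
        for (String host : locations) {
            if (host.equals(deadHost)) {
                last.add(host);      // suspected dead: keep only as a last resort
            } else {
                preferred.add(host);
            }
        }
        preferred.addAll(last);
        return preferred;
    }

    public static void main(String[] args) {
        List<String> locs = List.of("dn1.example", "dead-rs.example", "dn3.example");
        System.out.println(deprioritize(locs, "dead-rs.example"));
        // prints [dn1.example, dn3.example, dead-rs.example]
    }
}
```

      The host names here are placeholders; in the real patch the "dead" host is derived from the region server name encoded in the HLog file path.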

      The solution retained is the last one. Compared to what was discussed on the mailing list, the proposed patch will not modify HDFS source code but adds a proxy. This is for two reasons:

      • Some HDFS functions managing block ordering are static (MD5MD5CRC32FileChecksum). Implementing the hook in the DFSClient would require implementing only part of the fix, changing the DFS interface to make these functions non-static, or making the hook static. None of these solutions is very clean.
      • Adding a proxy allows all the code to live in HBase, simplifying dependency management.

      Nevertheless, it would be better to have this in HDFS. An HDFS-side fix, however, could target only the latest version, where it could include minimal interface changes such as non-static methods.

      Moreover, writing the blocks to a non-local DN would be an even better solution in the long term.

      1. 6435.unfinished.patch
        14 kB
        Nicolas Liochon
      2. 6435.v10.patch
        34 kB
        Nicolas Liochon
      3. 6435.v10.patch
        34 kB
        Nicolas Liochon
      4. 6435.v12.patch
        39 kB
        Nicolas Liochon
      5. 6435.v12.patch
        39 kB
        Nicolas Liochon
      6. 6435.v12.patch
        39 kB
        Nicolas Liochon
      7. 6435.v13.patch
        39 kB
        Nicolas Liochon
      8. 6435.v14.patch
        39 kB
        Nicolas Liochon
      9. 6435.v2.patch
        31 kB
        Nicolas Liochon
      10. 6435.v7.patch
        33 kB
        Nicolas Liochon
      11. 6435.v8.patch
        34 kB
        Nicolas Liochon
      12. 6435.v9.patch
        34 kB
        Nicolas Liochon
      13. 6435.v9.patch
        34 kB
        Nicolas Liochon
      14. 6435-v12.txt
        39 kB
        Ted Yu
      15. 6535.v11.patch
        36 kB
        Nicolas Liochon

        Issue Links

          Activity

          Nicolas Liochon added a comment -

          The patch is not finished. Actually, it contains the code for the hdfs hook and the related test, but not the code for defining the location order from the file name. But as it is different from what we initially discussed, I'm posting it here in case someone sees something I missed.

          It does not mean it should not be fixed in hdfs as well, just that this is likely to be much simpler than patching the 1.0 branch...

          Todd Lipcon added a comment -

          I'm -1 on this kind of hack going into HBase before we add the feature to HDFS. I agree that adding to HDFS proper means we have to wait for a release, but this kind of code is likely to be really fragile. Also, without HBase driving requirements of HDFS, it will never evolve to natively have these kinds of features, and HBase will devolve into a mess of reflection hacks to change around the HDFS internals.

          stack added a comment -

          Yeah, we should do both (I'd think that what's added to HDFS is more general than just this workaround scheme where local gets moved to the end of the list; i.e. we add being able to intercept the order returned by the NN and let a client-side policy alter it based on "local knowledge" if wanted.... Could add other customizations like being able to set timeout per DFSInput/OutputStream as you've suggested up on dev list N). Would be sweet if the 'hack' were available meantime while we wait on an hdfs release.

          Looking at patch, looks like inventive hackery; good on you.

          Do we have to do this in both master and regionserver? Can't do it in HFileSystem constructor assuming it takes a Conf (or that'd be too late?)

          + HFileSystem.addLocationOrderHack(conf);

          Rather than have it called a reorderProxy, call it an HBaseDFSClient? Might want to add more customizations while waiting on HDFS fix to arrive.

          Nicolas Liochon added a comment -

          My thinking was it could make it into an hdfs release that accepts changing public interfaces. I fully agree with you Todd, we need to do our homework and push hdfs to ensure that what we need is understood and makes it into a release. On the other hand, if I look at how it worked for much simpler stuff like JUnit and surefire, our changes have been in their trunk for a few months and we're still waiting. These things take time. But I will do my homework on hdfs, I promise (I may need your help, actually). The Jira will be created next week and if I have enough feedback I will propose a patch.

          I was also wondering whether it would be interesting for hdfs to natively offer interceptors. They were available for a long time in an ORB called Orbix and were great to use. But they would need to be per conf, so they cannot work with static stuff.

          Do we have to do this in both master and regionserver? Can't do it in HFileSystem constructor assuming it takes a Conf (or that'd be too late?)

          It can be put pretty late, basically before we start a recovery process. But we don't want it client side, so I will check this.

          Rather than have it called a reorderProxy, call it an HBaseDFSClient? Might want to add more customizations while waiting on HDFS fix to arrive.

          I've intercepted a lower-level call: I'm between the DFSClient and the namenode. This is because the DFSClient does more than just forward calls: it contains some logic. Hence going in front of the namenode. But yes, I could make it more generic.

          Todd Lipcon added a comment -

          I think there's a good motivation to add these kind of APIs generally to DFSInputStream. In particular, I think something like the following:

          public List<Replica> getAvailableReplica(long pos); // return the list of available replicas at given file offset, in priority order
          public void prioritizeReplica(Replica r); // move given replica to front of list
          public void blacklistReplica(Replica r); // move replica to back of list
          (or something of this sort)

          The Replica API would then expose the datanode IDs (and after HDFS-3672, the disk ID).
          So, in HBase we could simply open the file, enumerate the replicas, deprioritize the one on the suspected node, and move on with the normal code paths.
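          A usage sketch of that flow, with all types defined locally as stand-ins for the proposed API (nothing below exists in HDFS — it only illustrates how HBase would enumerate replicas and deprioritize the one on the suspected node):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in for the proposed Replica type exposing a datanode ID.
class Replica {
    final String datanodeId;
    Replica(String datanodeId) { this.datanodeId = datanodeId; }
}

// Stand-in for the proposed DFSInputStream additions.
class ProposedInputStream {
    private final List<Replica> replicas;

    ProposedInputStream(List<Replica> initial) {
        this.replicas = new ArrayList<>(initial);
    }

    // Available replicas at the given offset, in priority order.
    List<Replica> getAvailableReplicas(long pos) {
        return new ArrayList<>(replicas);
    }

    // Move the given replica to the back of the list ("blacklistReplica").
    void blacklistReplica(Replica r) {
        if (replicas.remove(r)) {
            replicas.add(r);
        }
    }
}

public class DeprioritizeSuspectedNode {
    public static void main(String[] args) {
        Replica a = new Replica("dn1"), b = new Replica("dn2"), c = new Replica("dn3");
        ProposedInputStream in = new ProposedInputStream(Arrays.asList(a, b, c));

        // dn2 hosted the dead region server: push its replica to the end.
        for (Replica r : in.getAvailableReplicas(0L)) {
            if (r.datanodeId.equals("dn2")) {
                in.blacklistReplica(r);
            }
        }
        for (Replica r : in.getAvailableReplicas(0L)) {
            System.out.println(r.datanodeId);   // dn1, dn3, dn2
        }
    }
}
```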

          Nicolas Liochon added a comment -

          I understand that you don't want to expose the internals, nor something like the DatanodeInfo. The same type of API would be useful for the output stream, putting priorities on nodes (and so reusing some knowledge of the dead nodes, or, for the WAL, avoiding the local writes). It's simple and efficient.

          With the current DFSClient implementation, a callback would ease cases like opening a file already opened for writing, or when the node list is cleared because they all failed. But maybe it can be changed as well.

          Todd Lipcon added a comment -

          With the current DFSClient implementation, a callback would ease cases like opening a file already opened for writing, or when the node list is cleared because they all failed. But maybe it can be changed as well.

          Can you explain further what you mean here? What would you use these callbacks for?

          Nicolas Liochon added a comment -

          If I want to keep the existing interface:

          Today, when you open a file, there is a call to a datanode if the file is also opened for writing somewhere. In HBase, we want the priorities to be taken into account during this opening, as we have a guess that one of these datanodes may be dead.

          So either I register a callback that the DFSClient will call before using its list, or I change the 'open' interface to allow providing the list of replicas. The same goes for chooseDataNode, called from blockSeekTo: even if we have a list at the beginning, this list is recreated during a read as part of the retry process (in case the NN discovered new replicas on new datanodes).

          If we put in a callback, we would offer this service:

          class  ReplicaSet {
            public List<Replica> getAvailableReplica(long pos); // return the list of available replicas at given file offset, in priority order
            public void prioritizeReplica(Replica r); // move given replica to front of list
            public void blacklistReplica(Replica r); // move replica to back of list
          }
          

          The client would need to implement this interface:

          // Implement this interface and provide it to the DFSClient during its construction to manage the replica ordering
          interface OrganizeReplicaSet{
           void organize(String fileName, ReplicaSet rs); 
          }
          

          And the DFSClient code would become:

          LocatedBlocks callGetBlockLocations(ClientProtocol namenode,
                String src, long start, long length) throws IOException {
              LocatedBlocks lbs = namenode.getBlockLocations(src, start, length);
              if (organizeReplicaSet != null) {
                  ReplicaSet rs = lbs.getAsReplicaSet();
                  try {
                      organizeReplicaSet.organize(src, rs);
                  } catch (Throwable t) {
                      throw new IOException("Replica reordering failed. class="
                          + organizeReplicaSet.getClass(), t);
                  }
                  return new LocatedBlocks(rs);
              } else {
                  return lbs;
              }
          }

          This is called from the DFSInputStream constructor in openInfo today.

          In real life I would try to use the class ReplicaSet as an interface on the internal LocatedBlock(s) to limit the number of objects created. The callback could also be given as a parameter to the DFSInputStream constructor if there is a specific rule to apply...

          stack added a comment -

          @Todd Given the suggested interface, how do we map from an hbase session expiration to a Replica? What if the DN died but the RS didn't? Won't the fact that the DFSClient under the wraps is banging its head, timing out against a dead DN – once per DFSInputStream – be hidden from the RS, since it's being handled down in the DFSClient? Don't we need more knowledge of the DFSClient's workings than the suggested API exposes if we are to avoid dead DNs? If we do figure we have a bad DN, do we then iterate over each open DFSInputStream updating priorities?

          Todd Lipcon added a comment -

          Good points. We should probably move this discussion over to an HDFS JIRA. Having a global DFSClient-wide ability to mark nodes un-preferred is probably advantageous.

          Nicolas Liochon added a comment -

          v2. May need some clean-up on the logs + a check to deactivate it for hadoop 2, for example.

          Nicolas Liochon added a comment -

          + I need to test it on a real cluster (emulating locations on a mini cluster can be dangerous...)

          Nicolas Liochon added a comment -

          Tested on a real cluster by adding validation code on a region server, went ok. I don't have a real idea on how to activate it just for some hadoop versions, so I will do a last clean-up on the logs and propose a final version.

          Nicolas Liochon added a comment -

          Ok for review...

          Ted Yu added a comment -

          Just started to look at the patch.
          It doesn't compile against hadoop 2.0:

          [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:compile (default-compile) on project hbase-server: Compilation failure: Compilation failure:
          [ERROR] /Users/zhihyu/trunk-hbase/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:[214,12] namenode is not public in org.apache.hadoop.hdfs.DFSClient; cannot be accessed from outside package
          [ERROR] 
          [ERROR] /Users/zhihyu/trunk-hbase/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:[221,52] namenode is not public in org.apache.hadoop.hdfs.DFSClient; cannot be accessed from outside package
          [ERROR] 
          [ERROR] /Users/zhihyu/trunk-hbase/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:[289,81] cannot find symbol
          [ERROR] symbol  : method getHost()
          [ERROR] location: class org.apache.hadoop.hdfs.protocol.DatanodeInfo
          

          Can we give the following a more meaningful name ?

          +    if (!conf.getBoolean("hbase.hdfs.jira6435", true)){  // activated by default
          

          Comment from Todd would be appreciated.

          Nicolas Liochon added a comment -

          I will have a look at the hadoop2 stuff.

          for

          Can we give the following a more meaningful name ?

          Do you have an idea?

          Ted Yu added a comment -

          How about 'hbase.filesystem.reorder.blocks' ?

          BTW replacing 'Hack' with some form of 'Intercept' would be better IMHO.

          Nicolas Liochon added a comment -

          Ok.
          I wanted to make clear it was a temporary workaround.

          Nicolas Liochon added a comment -

          v8 works ok with hadoop 1 & hadoop 2 and addresses Ted's other comments. I tried the v3 profile, but got errors in the pom.xml.

          Ted Yu added a comment -
          +  private static ClientProtocol createReordoringProxy(final ClientProtocol cp,
          

          Usually a spelling mistake would be a nit, but this one is in a method name.

          +  public static ServerName getServerNameFromHLogDirectoryName(Configuration conf, String path) throws IOException {
          

          The above line is too long.

          +              LOG.debug("Moved the location "+toLast.getHostName()+" to the last place." +
          +                  " locations size was "+dnis.length);
          

          I think the above log may appear many times.

          +            LOG.fatal("AAAA REORDER");
          

          The above can be made a debug log.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12538610/6435.v8.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 8 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).

          -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.fs.TestBlockReorder

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2464//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2464//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2464//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2464//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2464//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2464//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2464//console

          This message is automatically generated.

          Ted Yu added a comment -

          For the test failure:

          org.junit.ComparisonFailure: expected:<[localhost]> but was:<[host2]>
          	at org.junit.Assert.assertEquals(Assert.java:125)
          	at org.junit.Assert.assertEquals(Assert.java:147)
          	at org.apache.hadoop.hbase.fs.TestBlockReorder.testFromDFS(TestBlockReorder.java:320)
          	at org.apache.hadoop.hbase.fs.TestBlockReorder.testHBaseCluster(TestBlockReorder.java:271)
          

          testFromDFS() should have utilized the done flag for the while loop below:

          +        for (int y = 0; y < l.getLocatedBlocks().size() && done; y++) {
          +          done = (l.get(y).getLocations().length == 3);
          +        }
          +      } while (l.get(0).getLocations().length != 3);
          

          When l.getLocatedBlocks().size() is greater than 1, the above loop may exit prematurely.
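          Ted's point can be sketched with plain arrays standing in for HDFS LocatedBlocks (class and method names below are illustrative, not the actual test code): the done flag must gate the whole check, so the wait covers every block rather than only block 0.

```java
public class ReplicationWaitSketch {
  // Returns true only when EVERY block has the expected replica count.
  // Mirrors the loop Ted quotes: `done` gates the iteration, and the
  // caller should keep waiting while this returns false.
  static boolean allBlocksReplicated(int[][] blockLocations, int expected) {
    boolean done = true;
    for (int y = 0; y < blockLocations.length && done; y++) {
      done = (blockLocations[y].length == expected);
    }
    return done;
  }

  public static void main(String[] args) {
    // Block 0 has 3 replicas but block 1 only 2: the check must report
    // false, whereas the original `while (l.get(0).getLocations().length
    // != 3)` condition would exit prematurely.
    int[][] partial = { {1, 2, 3}, {1, 2} };
    System.out.println(allBlocksReplicated(partial, 3)); // false
    int[][] full = { {1, 2, 3}, {1, 2, 3} };
    System.out.println(allBlocksReplicated(full, 3)); // true
  }
}
```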

          Nicolas Liochon added a comment -

          Thanks for the review and the test failure analysis, Ted. v9 takes the comments into account.

          Ted Yu added a comment -

          From PreCommit build #2470, it looks like the compilation against Hadoop 2.0 failed.

          Nicolas Liochon added a comment -

          Is there a way to get more info on the failure?
          Locally,

          mvn test -Dhadoop.profile=2.0
          

          says

          Tests in error:
            testSimpleCase(org.apache.hadoop.hbase.mapreduce.TestImportExport)
            testWithDeletes(org.apache.hadoop.hbase.mapreduce.TestImportExport)
          
          Tests run: 719, Failures: 0, Errors: 2, Skipped: 2
          

          and TestBlockReorder is OK (executed 5 times).

          Ted Yu added a comment -

          The compilation in PreCommit build was aborted.
          I couldn't reproduce the issue.

          Suggest re-attaching patch v9.

          Nicolas Liochon added a comment -

          done

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12538789/6435.v9.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 8 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).

          -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.client.TestAdmin
          org.apache.hadoop.hbase.fs.TestBlockReorder

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2472//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2472//console

          This message is automatically generated.

          Nicolas Liochon added a comment -

          I was expecting the name to be localhost, but that's not the case in the hadoop-qa env:

          /asf011.sp2.ygridcore.net,43631,1343836299404/asf011.sp2.ygridcore.net%2C43631%2C1343836299404.1343836318993 is an HLog file, so reordering blocks, last hostname will be:asf011.sp2.ygridcore.net
          

          So the trick used to check location ordering on a mini cluster does not work. I will find another way...
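          For context, the HLog path above encodes the server that wrote it: the directory component is of the form host,port,startcode, so the hostname is everything before the first comma. A minimal sketch of that extraction (illustrative names; not the patch's actual parsing code):

```java
public class HLogHostSketch {
  // Extracts the writing region server's hostname from an HLog path whose
  // first directory component is "<host>,<port>,<startcode>".
  static String hostFromHLogPath(String path) {
    String[] parts = path.split("/");
    String serverDir = parts[1];      // "<host>,<port>,<startcode>"
    return serverDir.split(",")[0];   // "<host>"
  }

  public static void main(String[] args) {
    String p = "/asf011.sp2.ygridcore.net,43631,1343836299404/"
        + "asf011.sp2.ygridcore.net%2C43631%2C1343836299404.1343836318993";
    System.out.println(hostFromHLogPath(p)); // asf011.sp2.ygridcore.net
  }
}
```

          The hook described in the release note puts this host at the end of the replica list, so it is tried only if all other replicas fail.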

          Ted Yu added a comment -

          From https://builds.apache.org/job/PreCommit-HBASE-Build/2479/console, it looks like compilation didn't pass for hadoop 2.0

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12538906/6435.v10.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 8 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).

          -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.fs.TestBlockReorder

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2481//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2481//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2481//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2481//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2481//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2481//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2481//console

          This message is automatically generated.

          Nicolas Liochon added a comment -

          ok, 127.0.0.1 is localhost, but it's not the name used by the minicluster... Will try again to trick the test...

          Nicolas Liochon added a comment -

          I had to change the test to make it more hadoop-qa friendly. In one of my numerous attempts, I added the possibility to start a miniCluster with a specific HMaster or HRegionServer class. I finally didn't use it here, but I kept it in the patch as it may be useful later...

          Ted Yu added a comment -

          PreCommit build #2490 got aborted:

          /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/dev-support/test-patch.sh: line 353: 14017 Aborted                 $MVN clean test help:active-profiles -X -DskipTests -Dhadoop.profile=2.0 -D${PROJECT_NAME}PatchProcess > $PATCH_DIR/trunk2.0JavacWarnings.txt 2>&1
          
          Ted Yu added a comment -

          Patch from N.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12538981/6435.v12.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 11 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).

          -1 findbugs. The patch appears to introduce 10 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.client.TestFromClientSide

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2494//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2494//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2494//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2494//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2494//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2494//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2494//console

          This message is automatically generated.

          Nicolas Liochon added a comment -

          org.apache.hadoop.hbase.client.TestFromClientSide

          I think it's unrelated. Let's retry.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12538999/6435.v12.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 11 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).

          -1 findbugs. The patch appears to introduce 10 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2496//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2496//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2496//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2496//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2496//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2496//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2496//console

          This message is automatically generated.

          Ted Yu added a comment -

          @N:
          Can you put the patch on the review board?

          Thanks

          Nicolas Liochon added a comment -

          Ok. Tried. "Something broke! (Error 500)", I will retry later.

          Ted Yu added a comment -

          Same as N's patch v12.

          I was able to generate review on review board from this patch.

          Nicolas Liochon added a comment -

          https://reviews.apache.org/r/6522/

          Nicolas Liochon added a comment -

          v13 takes into account the comments from the review board.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12541051/6435.v13.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 11 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).

          -1 findbugs. The patch appears to introduce 9 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
          org.apache.hadoop.hbase.master.TestMasterNoCluster

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2586//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2586//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2586//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2586//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2586//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2586//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2586//console

          This message is automatically generated.

          Ted Yu added a comment -

          +1 on v13 if TestSplitTransactionOnCluster#testSplitBeforeSettingSplittingInZK passes.

          stack added a comment -

          +1 on commit. Some notes below that you can address on commit. Tests look good on cursory glance. Comprehensive. Nice hacking N.

          HMaster does an import of this:

          +import org.apache.hadoop.hbase.fs.HFileSystem;

          ... but not used. Fix on commit.

          Not important, but I'd check that it's DFS before I'd check the reorder-enabled flag. Next time.

          What's this about?

          +      modifiersField.setInt(nf, nf.getModifiers() & ~Modifier.FINAL);

          Some of this patch probably belongs in compatibility layers. One day the reorder will be in hadoop.... We can address in new issue.

          What does this mean?

          + // We have a rack to get always the same location order but it does not work.

          Nicolas Liochon added a comment -

          ... but not used. Fix on commit.

          Ok

          modifiersField.setInt(nf, nf.getModifiers() & ~Modifier.FINAL);

          The field is final; we clear the final modifier via reflection because we need to change its value.
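          That line is the standard JDK-8-era reflection idiom for overwriting a final field: clear the FINAL bit in Field.modifiers before calling set(). A self-contained sketch under that assumption (class and field names are illustrative; newer JDKs filter the modifiers field, so this is assumed to run on the Hadoop-1.x-era JVMs discussed here):

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class FinalFieldSketch {
  static final class Holder {
    // Object type avoids compile-time constant inlining of the value.
    static final Object NAME = "original";
  }

  // Overwrites the static final field and returns what reflection now sees.
  static Object overwrite() throws Exception {
    Field nf = Holder.class.getDeclaredField("NAME");
    nf.setAccessible(true);
    // Clear the FINAL bit in Field.modifiers so set() is allowed
    // (works on JDK 8-era runtimes; JDK 12+ filters this field).
    Field modifiersField = Field.class.getDeclaredField("modifiers");
    modifiersField.setAccessible(true);
    modifiersField.setInt(nf, nf.getModifiers() & ~Modifier.FINAL);
    nf.set(null, "replaced");
    return nf.get(null);
  }

  public static void main(String[] args) throws Exception {
    System.out.println(overwrite()); // replaced
  }
}
```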

          + // We have a rack to get always the same location order but it does not work.

          I could remove it on commit... I wanted to use racks to always get the same order, but it does not work; the racks are not taken into account in this case, and I don't know why...

          Thanks for the review Ted and Stack, I will commit it beginning of next week if I don't have another feedback.

          stack added a comment -

          +1 on commit w/ above two edits. Needs nice fat release note. Good on you N.

          Nicolas Liochon added a comment -

          release notes done.

          stack added a comment -

          @N Nice note. You should write a blog on it and your other findings. You going to commit?

          Nicolas Liochon added a comment -

          v14: version I'm going to commit as soon as the local tests (in progress) are ok.

          Nicolas Liochon added a comment -

          Ok, local tests said:
          Tests in error:
          testGetRowVersions(org.apache.hadoop.hbase.TestMultiVersions): Shutting down
          testScanMultipleVersions(org.apache.hadoop.hbase.TestMultiVersions): org.apache.hadoop.hbase.MasterNotRunningException: Can create a proxy to master, but it is not running

          Not reproduced (tried once).

          Committed revision 1375451.

          Nicolas Liochon added a comment -

          + Committed revision 1375454, as I forgot to add the new test in svn initially.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12541734/6435.v14.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 11 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).

          -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.io.encoding.TestUpgradeFromHFileV1ToEncoding

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2638//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2638//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2638//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2638//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2638//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2638//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2638//console

          This message is automatically generated.

          Hudson added a comment -

          Integrated in HBase-TRUNK #3247 (See https://builds.apache.org/job/HBase-TRUNK/3247/)
          HBASE-6435 Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes - addendum TestBlockReorder.java (Revision 1375454)
          HBASE-6435 Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes (Revision 1375451)

          Result = FAILURE
          nkeywal :
          Files :

          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/fs
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/fs/TestBlockReorder.java

          nkeywal :
          Files :

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ServerName.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java
          Nicolas Liochon added a comment -

          Test results (2 failures / ±0)
          org.apache.hadoop.hbase.TestMultiVersions.testGetRowVersions
          org.apache.hadoop.hbase.TestMultiVersions.testScanMultipleVersions

          Hum. It's the same error as the one I had in my first local test. But it's so unrelated, and moreover we had this error in build #3242 as well, so I think it's ok. Marking as resolved.

          Hudson added a comment -

          Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #140 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/140/)
          HBASE-6435 Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes - addendum TestBlockReorder.java (Revision 1375454)
          HBASE-6435 Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes (Revision 1375451)

          Result = FAILURE
          nkeywal :
          Files :

          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/fs
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/fs/TestBlockReorder.java

          nkeywal :
          Files :

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ServerName.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java
          Hudson added a comment -

          Integrated in HBase-TRUNK #3320 (See https://builds.apache.org/job/HBase-TRUNK/3320/)
          HBASE-6746 Impacts of HBASE-6435 vs. HDFS 2.0 trunk (Revision 1382723)

          Result = FAILURE
          nkeywal :
          Files :

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/fs/TestBlockReorder.java
          Hudson added a comment -

          Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #168 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/168/)
          HBASE-6746 Impacts of HBASE-6435 vs. HDFS 2.0 trunk (Revision 1382723)

          Result = FAILURE
          nkeywal :
          Files :

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/fs/TestBlockReorder.java
          Nicolas Liochon added a comment -

          As HDFS-3701 (data loss) is in branch 1.1, as is HDFS-3703 (which helps to minimize data read errors), I think it implies that we should target 1.1 as the recommended minimal version for 0.96. If that's the case, we can remove this fix, as it contains a dependency on HDFS internals. If we keep it, I need to fix the filename analysis and to handle the "-splitting" suffix on the directories managed. In both cases, it should be done in separate jiras, but let's have the discussion here.

          Ted Yu added a comment -

          I think we can poll dev@hbase for minimal hadoop version requirement.
          If 1.1 passes as the minimal version, we should remove this fix.

          Nicolas Liochon added a comment -

          I suppose we won't want to make it the minimum, at least to ease migration. But someone who considers MTTR important would have to migrate to 1.1.

          stack added a comment -

          So not a requirement but a strong suggestion?

          Yeah, we should discuss on dev.

          Nicolas Liochon added a comment -

          During the tests on the impact of waiting for the end of HDFS recoverLease, it appeared that:

          • there is a bug, and some files are not detected.
          • we have a dependency on the machine name (an issue if a machine has multiple names).

          HDFS-4754 supersedes this, so, to keep things simple and limit the number of possible configurations, my plan is:

          • make sure that HDFS-4754 makes it to a reasonable number of hdfs branches.
          • revert this.
          stack added a comment -

          Marking closed.


            People

            • Assignee: Nicolas Liochon
            • Reporter: Nicolas Liochon
            • Votes: 0
            • Watchers: 13