HBASE-7213

Have HLog files for .META. and -ROOT- edits only

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.95.0
    • Component/s: master, regionserver
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      The regionserver carrying .META. / -ROOT- region will now write two WALs; the usual one w/ all edits and then a special one with a .meta. suffix into which all edits for .META. / -ROOT- region go. These files will be recovered first on server crash.

      Description

      Over on HBASE-6774, there is a discussion on separating out the edits for .META. regions from the other regions' edits w.r.t where the edits are written. This jira is to track an implementation of that.

      1. 7213-in-progress.patch
        17 kB
        Devaraj Das
      2. 7213-in-progress.2.patch
        35 kB
        Devaraj Das
      3. 7213-in-progress.2.2.patch
        36 kB
        Devaraj Das
      4. 7213-2.4.patch
        43 kB
        Devaraj Das
      5. 7213-2.6.patch
        44 kB
        Devaraj Das
      6. 7213-2.8.patch
        44 kB
        Devaraj Das
      7. 7213-2.9.patch
        45 kB
        Devaraj Das
      8. 7213-2.10.patch
        46 kB
        Devaraj Das
      9. 7213-2.11.patch
        47 kB
        Devaraj Das
      10. 7213v13.txt
        45 kB
        stack
      11. 7213-2.12.patch
        50 kB
        Devaraj Das
      12. 7213-2.14.patch
        48 kB
        Devaraj Das
      13. 7213-2.14.patch
        48 kB
        Ted Yu
      14. TEST-org.apache.hadoop.hbase.client.TestMultiParallel.xml
        8.39 MB
        Ted Yu
      15. 7213-15.patch
        45 kB
        Ted Yu
      16. 7213-2.16.patch
        48 kB
        Devaraj Das
      17. 7213-addendum.patch
        1 kB
        chunhui shen

          Activity

          ramkrishna.s.vasudevan added a comment -

          Is the idea to write it in a separate WAL file, so that it can be picked up first?

          stack added a comment -

          Yes

          Enis Soztutar added a comment -

          This can also allow us to directly deploy META without waiting for log splitting (if we have separate WAL for every META region, there is nothing to split).

          Devaraj Das added a comment -

          Apologies for not responding earlier. But yes, the high-level idea is to make recovery of the META table fast and hence reduce the downtime of META should the regionserver hosting it crash. By having a dedicated HLog for META, the log splitting will be really simple (maybe just simple rename operations on files; need to look at the code here)...

          I am in the middle of plumbing through the code to make this work. Will hopefully have something in a day or two.

          ramkrishna.s.vasudevan added a comment -

          @Deva
          Thanks for the update. I remember there is also an HLog created inside HRegion that never gets used. Can we make use of it?

          Devaraj Das added a comment -

          Will take a look ramkrishna.s.vasudevan, but is there any particular reason why that should be reused? (are you referring to the field

          final HLog log;

          ).

          Devaraj Das added a comment -

          Here is a rough, currently work in progress, patch. At a high level:
          1. There is a new MetaServices class that currently provides the services for creating/getting meta hlog.
          2. OpenMetaHandler invokes the method for creating the meta hlog, and also makes sure that the opened region gets the right hlog instance.
          3. The meta hlog file is created in the same root hlog directory, but is suffixed with .META.

          I still need to plumb through the log splitting code, etc. But I thought I'll post what I have to see if there is any immediate feedback.

          ramkrishna.s.vasudevan added a comment -

          Deva,
          I could not apply the patch to trunk as I don't have the trunk code with me now (will apply it later once I go home).
          Just one question I have here.
          If the META region moves newly to an RS, we instantiate the HLog for META.
          If an RS that had the META once and again gets back the META in that case we should be pointing to the same HLog right?
          If the code does this already, pardon me; I had this doubt because I could not apply the patch.

          Devaraj Das added a comment -

          If an RS that had the META once and again gets back the META in that case we should be pointing to the same HLog right?

          Yes, ramkrishna.s.vasudevan, it should be pointing to the same HLog. I am wondering though when could this happen. I can see the obvious case where a RS dies and the meta gets assigned to some other RS, but when does the former get it back?

          ramkrishna.s.vasudevan added a comment -

          Thanks Deva. Normally it is not possible. But looking at the 0.94 code, at least, we don't block an assign or move command issued against META!

          Nicolas Liochon added a comment -

          Is it something forbidden? Because there is a (fixed) jira that says we should be able to do it (HBASE-3756).
          It's obviously critical, but I think it's required for rolling restart, where we move all the regions off a server before stopping/starting it.
          We could also imagine balancers doing this...

          ramkrishna.s.vasudevan added a comment -

          @N
          Yes, the balancer is also one thing that does it. Just verified the code.

          Devaraj Das added a comment -

          Thanks for the clarifications, ramkrishna.s.vasudevan and Nicolas Liochon. The thought of moving meta sounds scary to me though [but this patch doesn't intend to affect that part].

          ramkrishna.s.vasudevan added a comment -

          @Deva
          Patch looks good. We may need to check on log rolling, log splitting, etc.
          One thing: if the META has just moved from one RS to another RS and the first RS crashes, the first RS will still have an HLog for the META (because it was holding it). So in this case I think we need not split that META hlog, right? Correct me if I am wrong here.

          Devaraj Das added a comment -

          ramkrishna.s.vasudevan, thanks for looking. Yes, this patch is not complete yet and the next iteration will have the missing functionalities. I'll also answer your questions in the next iteration.

          stack added a comment -

          ramkrishna.s.vasudevan On the case where .META. moved from one RS to another and the first crashes: when we go to recover, we will see it wasn't carrying .META. so we won't try and recover it (as is the case w/ any other region).

          stack added a comment -

          On the approach in the sketch patch:

          Passing a flag to instantiateHLog when it's meta is a little awkward. It ripples all the way down. Would instantiateMetaHLog and createMetaHLog in the factory be more plain about what is going on? It runs into a wall at FSHLog though; here, is there anything in the log name that would denote it a .META. WAL?

          Patch doesn't look bad so far. Not too intrusive for big payback.

          I was thinking this facility would come in as part of the multiwal work... if multiwals, make provision for an extra if carrying .meta..... but this seems like shorter path. Do you think this work will get in the way of the multiwal feature?

          Devaraj Das added a comment -

          stack, yes, I am trying to keep the patch as unintrusive as I can. The fact that the multiwal is being done for META only helps towards that (for example, replication won't need to be tweaked since replication is not done for META entries, etc.). I'll check on the flag comment and see if I can introduce new methods rather than use a flag.

          ramkrishna.s.vasudevan added a comment -

          we will see it wasn't carrying .META. so we won't try and recover it

          Yes that is true... MetaServerShutdownHandler alone should be handling the split of the META hlog. The normal SSH will handle only the other HLog.

          Ted Yu added a comment -

          I wonder what kind of data structure should be in place for the FSHLog instances (two for the region server hosting .META.) so that multiple WAL implementation would be more intuitive.
          Previously, close / deletion of HLog operates on the single instance. Now things start to get more interesting.

          Todd Lipcon added a comment -

          I agree it seems to make sense to lump this with the multi-WAL work. Perhaps an interface like "WALFactory" or "WALProvider", which, given a region name, gives back a WAL instance? The basic implementation would always provide the single WAL. Then, we could add the feature that returns a different WAL for META alone. More complex implementations could choose to give different tenants of a cluster separate WALs, etc.
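
The provider idea above could be sketched roughly as follows. This is only an illustration under assumptions: `Wal`, `WalProvider`, `NamedWal`, `SingleWalProvider` and `MetaAwareWalProvider` are hypothetical names, not HBase classes, and matching catalog regions by a `.META.` / `-ROOT-` name prefix is an assumed convention.

```java
// Hypothetical stand-in for a write-ahead log; not the real HLog interface.
interface Wal {
  String name();
}

// Hypothetical provider: given a region name, hand back the WAL to write to.
interface WalProvider {
  Wal getWal(String regionName);
}

final class NamedWal implements Wal {
  private final String name;

  NamedWal(String name) {
    this.name = name;
  }

  @Override
  public String name() {
    return name;
  }
}

// Basic implementation: every region shares the single WAL.
final class SingleWalProvider implements WalProvider {
  private final Wal shared = new NamedWal("default");

  @Override
  public Wal getWal(String regionName) {
    return shared;
  }
}

// Variant: catalog regions get a dedicated WAL, all other regions share one.
final class MetaAwareWalProvider implements WalProvider {
  private final Wal shared = new NamedWal("default");
  private final Wal meta = new NamedWal("meta");

  @Override
  public Wal getWal(String regionName) {
    // Assumption: catalog region names start with the catalog table name.
    boolean catalog = regionName.startsWith(".META.") || regionName.startsWith("-ROOT-");
    return catalog ? meta : shared;
  }
}
```

A per-tenant implementation would be another class behind the same interface, which is the appeal of routing the choice through one lookup method rather than a flag.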

          Ted Yu added a comment -

          We already have HLogFactory in trunk.

          to lump this with the multi-WAL work

          Can I interpret the above as saying that multi-WAL work should be done at the same time, if not earlier ?

          Since HLogFactory can hand out the unique instance for .META., it is not far from handing out different instances (for different regions) which is what HBASE-5699 tries to do.

          Devaraj Das added a comment -

          Todd Lipcon, good points there. But I'd like to separate the META SPOF work from the full fledged multiwal work (for the full multi-wal case, we'd need to fix up things like replication, and those can be skipped from the meta-only design that this jira attempts to do).
          Ted Yu, I guess you are right. In theory, one could read the HLogFactory as the WALFactory...

          Thoughts? (i am in the process of extending the patch I previously posted to a fully functional one)

          Todd Lipcon added a comment -

          Yea, I don't think you need to strictly sequence this after the multi-WAL work. But it would be nice to have the "end goal" in mind while doing this work. Sorry, haven't had time to look at the in-progress patch, but if there's a simple solution that works OK now, no sense blocking it for the perfect end-game solution later.

          Nicolas Liochon added a comment -

          To me, there is some stuff that is nice to have in multiwal but very useful here:

          • it's better to have only one table in the wal with ".meta."
          • it would be great to be able to configure the replication factor for this wal
          • having a separate wal for meta should be the only option (it could make sense short term to have it as an option for safety, but it would be temporary)
          • there is no real split of the .meta. wal (at least conceptually: all regions end up on the same server..). There is some room for (premature?) optimization there.
          Devaraj Das added a comment -

          A more complete version. Testing in progress. This patch adds META-specific APIs in some classes (as opposed to a flag, as suggested by stack earlier). The logs for the META region are written to a special file ending with ".META". The SSH's process method treats META hlogs with higher priority and splits them first, and as soon as the split is done, tries to assign the META regions, and then goes on to split other regions / assignments. I have not made changes to how the ROOT region is handled, and the reason is that it is very likely that the ROOT region's edits are already persisted on disk (and in case folks feel that ROOT should be handled as well, it probably can be done in a follow up..).
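
The "split META hlogs first" ordering described above can be illustrated with a small sketch. It is not code from the patch: `SplitOrder` and `metaFirst` are hypothetical names, file names are plain strings rather than `Path` objects, and the `.META` suffix is taken from the comment above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SplitOrder {
  // Assumed suffix, as described in the comment; the real constant may differ.
  static final String META_SUFFIX = ".META";

  // Returns a copy of logFiles with META log files moved to the front.
  static List<String> metaFirst(List<String> logFiles) {
    List<String> ordered = new ArrayList<>(logFiles);
    // Key is false for META files and true for the rest; false sorts first.
    // List.sort is stable, so the relative order inside each group is kept.
    ordered.sort(Comparator.comparing((String f) -> !f.endsWith(META_SUFFIX)));
    return ordered;
  }
}
```

A recovery loop that walks the returned list in order would split the META logs, and could trigger META assignment, before touching any other region's edits.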

          Ted Yu added a comment -
          +    List<ServerName> serverNames = new ArrayList<ServerName>();
          +    serverNames.add(serverName);
          +    List<Path> logDirs = getLogDirs(serverNames);
          

          The first two lines above can be moved below the if block which checks logDirs.isEmpty()

          +   * @throws IOException
          +   *             if there was an error while splitting any log file
          +   * @return cumulative size of the logfiles split
          +   * @throws IOException 
          +   */
          +  public long splitMetatLogDistributed(final List<Path> logDirs) throws IOException {
          

          Please group the @throws lines together. The method name is misspelled - I couldn't find where it is called.
          I think the above method is eclipsed by splitMetaLogDistributed() which appears later.

          +      if (isCarryingMeta() || isCarryingRoot()) {
          +        try {
          +          LOG.info("Splitting META logs for " + serverName);
          +          if (this.shouldSplitHlog) {
          +            this.services.getMasterFileSystem().splitMetaLog(serverName);
          

          The check for ROOT region above is for assigning ROOT. Suggest making the check of ROOT and .META. separate so that the logic is clearer.

          +  protected HLog instantiateMetaHLog(Path rootdir, String logName) throws IOException {
          +    return HLogFactory.createMetaHLog(this.fs.getBackingFs(), rootdir, logName, this.conf,
          +        getMETAWALActionListeners(), this.serverNameFromMasterPOV.toString());
          +  }
          

          The above method is only called by setupMetaWAL(). Can we merge it with setupMetaWAL() ?
          nit: getMETAWALActionListeners(), do we need to make all 4 letters of Meta capitalized ?

          +      LOG.info("ISALIVE: " + leases.isAlive() + " " +cacheFlusher.isAlive() + " "+ this.compactionChecker.isAlive() + " " + hlogRoller.isAlive() + " " + metaHlogRoller.isAlive()); //REMOVETHIS
          

          As the comment says: remove in next patch.

          +class MetaLogRoller extends LogRoller {
          

          Consider putting MetaLogRoller in its own class.

          For MetaServices.java, please add license and audience annotation.

          Enis Soztutar added a comment -

          and in case folks feel that ROOT should be handled as well, it probably can be done in a follow up

          I think JD has updated the patch for removing ROOT in favor of ZK.

          Devaraj Das added a comment -

          Addresses most of the comments..

          One thing I didn't change is

          The check for ROOT region above is for assigning ROOT ..

          In HBASE-3171, there is work underway to remove ROOT, so there is not much gain in addressing how the root region is handled here. I have left it as such (with a known, very rare corner case: if the server holding the root region crashes before the region's edits are persisted, the assignment of the region might be an issue).

          Manual tests seemed to indicate things are working well. I'll start the full set of unit tests now.

          Devaraj Das added a comment -

          Trying hadoopqa.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12555829/7213-in-progress.2.2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          -1 javadoc. The javadoc tool appears to have generated 99 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 26 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.catalog.TestMetaReaderEditor

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3443//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3443//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3443//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3443//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3443//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3443//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3443//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3443//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3443//console

          This message is automatically generated.

          Devaraj Das added a comment -

          TestMetaReaderEditor (timed out in the hadoopqa run) passes locally..

          stack added a comment -

          .META. file ends in '.META' and not '.META.'? Can't it end in .log? xxxxxx.log.META looks a bit odd and I'm sure will trip up someone else trying to find WALs for whatever reason.

          How does splitMetaLog work? It is passed a server name and if we find logs in the filesystem, then we'll split its logs? What if .META. is currently assigned? The splitting will be to no avail? When would we ever log this message:

          +      LOG.info("No logs to split");
          

          Should that be .META. logs to split?

          Methods usually have a space between them rather than:

          +  }
          +  /**
          

          ... in hbase codebase boss.

          Why emit two log lines when you could have one that varies with whether meta or not:

                       LOG.info("Done splitting " + path);
          +            if (path.endsWith(HConstants.META_HLOG_FILE_EXTN)) {
          +              LOG.info("Done splitting meta region");
          +            }
          

          Should be a isMetaWAL method? to do the looking for META_LOG_FILE_EXTN... since used in a few places.
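The two suggestions above (one log line that varies with the file type, and a shared isMetaWAL check) could be sketched roughly as follows. This is only an illustration: the class name `MetaWalNames`, the `META_SUFFIX` value, and `doneSplittingMessage` are hypothetical stand-ins and are not taken from the patch.

```java
// Hypothetical sketch; none of these names come from the patch itself.
final class MetaWalNames {
  // Assumed suffix marking a WAL that carries only .META. edits.
  static final String META_SUFFIX = ".meta";

  private MetaWalNames() {}

  /** Returns true if the given WAL path looks like a meta-only WAL. */
  static boolean isMetaWal(String path) {
    return path != null && path.endsWith(META_SUFFIX);
  }

  /** Builds a single log message that varies with the kind of WAL split. */
  static String doneSplittingMessage(String path) {
    return "Done splitting " + (isMetaWal(path) ? "meta WAL " : "WAL ") + path;
  }

  public static void main(String[] args) {
    System.out.println(doneSplittingMessage("rs1.1354695931168.meta"));
    System.out.println(doneSplittingMessage("rs1.1354695931168"));
  }
}
```

Keeping the suffix test in one helper means the callers that need to tell the two kinds of file apart do not each repeat the string comparison.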

          In process, we have code for meta and root handling in the general ServerShutdownHandler. Should the root handling be up in RootServerShutdownHandler, in its process, which then calls through to the general SSH#process, and ditto for the .META.? (Might mean ROOT needs to subclass Meta for now...) We maybe should have done it before, but it was a bit tougher since we had to do the log splitting first... we could have broken up the process more... but it might be easier now that these root and meta handlings are the first thing done in the process method.

          Is this needed?

          -        this.services.getExecutorService().submit(this);
          +        //typecast to SSH so that we don't do the above meta log split again
          +        this.services.getExecutorService().submit((ServerShutdownHandler)this);
          

          Would make sense if you move the meta specific code to metaSSH, and ditto on ROOT?

          MetaServices Interface doesn't seem right? It's all about a WAL. What else would ever go in here? Who uses this new Interface and its methods? I don't see anything else in this patch using it. I suppose MetaLogRoller does, but it is sitting beside RegionServer – could be protected or package private – yet it's public in the Interface. Hmm... I see that OpenRegionHandler uses it. Can't RegionServer notice when it's a meta region and do the setup etc. internally? Ugh, maybe not, looking at this OpenRegionHandler... how it needs the exact WAL? What if you passed in the regionname to getWAL? Then it internally would do the right thing. You wouldn't have to add new methods? Especially public ones?

          When we lose the .META. region, does the RS clean up its .META. log?

          I don't like extending the RegionServerServices because it makes it harder mocking up an RSS... more stuff to add in.

          Should it be MetaFSHLog rather than FSHLog w/ boolean args in constructor? Just wondering.

          You have this: + LOG.info("CREATED writer " + path); //REMOVETHIS

          We have to add this method?

          +  public static HLog createMetaHLog(final FileSystem fs, final Path root, final String logName,
          +      final Configuration conf, final List<WALActionsListener> listeners,
          +      final String prefix) throws IOException {
          +    return new FSHLog(fs, root, logName, HConstants.HREGION_OLDLOGDIR_NAME,
          +        conf, listeners, false, prefix, true);
          +  }

          Nothing in, say, the log name that would clue us in that it's a .META. file? Could we preface the logName with .META. or something and then we don't have to pass flags to FSHLog? Could have static isMetaWAL method that looks at log name and can tell it's .META.?

          Import this in LogSplitter but not used: +import org.apache.hadoop.fs.PathFilter;

          Is this used:

          -  private static final Pattern pattern = Pattern.compile(".*\\.\\d*");
          +  private static final Pattern pattern = Pattern.compile(".*\\.\\d*(.META)*");

          Good stuff.

          Devaraj Das added a comment -

          .META. file ends in '.META' and not '.META.'? Can't it end in .log? xxxxxx.log.META looks a bit odd and I'm sure will trip up someone else trying to find WALs for whatever reason.

          Hmm.. "xxxx.log.META" sounds better..

          How does splitMetaLog work? It is passed a server name and if we find logs in the filesystem, then we'll split its logs?

          Yes.. (much like how it's currently done for the non-meta logs)

          What if .META. is currently assigned? The splitting will be to no avail?

          Can you please explain when this can happen (that a RS hosting the .META. crashes, and before we do the log splitting, the .META. gets assigned to someone else)? But in general, the behavior shouldn't change in the patch from what the behavior currently is...

          When would we ever log this message:

          LOG.info("No logs to split");

          This is the same log that gets emitted for the general case (in the current codebase). Look at splitLog method in MasterFileSystem. Seems like when the logdir (.logs) doesn't exist, getLogDirs returns an empty list.

          Should that be .META. logs to split?

          Yes..

          Methods usually have a space between them rather than:

          Ack. I missed adding a space. I usually always add spaces without fail (smile)

          Why emit two log lines when you could have one that varies with whether meta or not:

          This was mostly for my debugging. I'll update this..

          Should be a isMetaWAL method? to do the looking for META_LOG_FILE_EXTN... since used in a few places.

          Yeah..

          We maybe should have done it before, but it was a bit tougher since we had to do the log splitting first... we could have broken up the process more...

          I went that route and backtracked.. The problem is that when a RS crashes, we will probably have that RS hosting many regions (including the catalog ones). To recover we have to call process() methods of both SSH and MSSH as a pair. That seemed harder to maintain.. So I kept the existing semantics of one SSH process doing the work for both META and non-meta regions. As I have noted in a previous comment, I have explicitly not handled ROOT since the root region is going to disappear post HBASE-3171.

          Is this needed?

          Yeah, since the process() method handles both .META. and non-meta regions. My earlier comment explains why I am not very inclined to separate the handling in process(). But I'll reconsider this part and see if I can do something better.

          MetaServices Interface doesn't seem right? It's all about a WAL. You wouldn't have to add new methods? Especially public ones?

          Yeah, I shouldn't have made that a public class, etc. and maybe called it MetaWALServices or something. Would doing that alleviate this concern?

          When we lose the .META. region, does the RS clean up its .META. log?

          I need to check on this.

          I don't like extending the RegionServerServices because it makes it harder mocking up an RSS... more stuff to add in.

          The new methods could just return null (and that's what I have done in the existing mock classes). But I'll consider this.

          Should it be MetaFSHLog rather than FSHLog w/ boolean args in constructor? Just wondering.

          You mean another class.. Seems overkill to me..

          We have to add this method?

          public static HLog createMetaHLog(final FileSystem fs, final Path root, final String logName,

          Nothing in, say, the log name that would clue us in that it's a .META. file? Could we preface the logName with .META. or something and then we don't have to pass flags to FSHLog? Could have static isMetaWAL method that looks at log name and can tell it's .META.?

          The "logname" (and this is there in the current trunk) is kind of misleading. It actually signifies the directory name (and maybe I should update the code to call it logdir everywhere)... So in that sense, there is nothing indicative of "meta". Also, I'd prefer passing explicit booleans rather than string based conditionals..

          Is this used:

          private static final Pattern pattern = Pattern.compile(".*\\.\\d*");
          + private static final Pattern pattern = Pattern.compile(".*\\.\\d*(.META)*");

          Devaraj Das added a comment -

          Sorry, I hit the submit button too soon. Ignore the last few lines in the last comment (starting from "Is this used").

          On the last bit:

          Is this used:

          private static final Pattern pattern = Pattern.compile(".*\\.\\d*");

          + private static final Pattern pattern = Pattern.compile(".*\\.\\d*(.META)*");

          Yes, otherwise the master deletes the .META hlog files.. (this makes it so that files are accepted with .META extensions).
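As a rough illustration of why the file-name validation pattern has to accept the new suffix, the sketch below checks names against a pattern with an optional meta suffix. The class `WalNamePattern`, the stricter regex, and the literal `.META` suffix are assumptions made for this example; the pattern in the patch may differ.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; the regex is illustrative, not the one from the patch.
final class WalNamePattern {
  // <anything>.<digits>, optionally followed by a literal ".META" suffix.
  private static final Pattern WAL_NAME = Pattern.compile(".*\\.\\d+(\\.META)?");

  private WalNamePattern() {}

  /** Returns true if the file name is shaped like a regular or meta WAL. */
  static boolean isValidWalName(String name) {
    return WAL_NAME.matcher(name).matches();
  }

  public static void main(String[] args) {
    System.out.println(isValidWalName("rs%2C60020%2C1354695930505.1354695931168"));
    System.out.println(isValidWalName("rs%2C60020%2C1354695930505.1354695931168.META"));
  }
}
```

A validator that only accepted the digits-only form would reject the suffixed files, and a cleaner relying on it would then treat them as invalid, which is the deletion problem described above.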

          Devaraj Das added a comment -

          When we lose the .META. region, does the RS clean up its .META. log?

          I don't think I need to do anything specific here. The existing code should handle this already.

          Devaraj Das added a comment -

          Okay, this patch addresses most of the comments. The thing that I have left here not-handled is the (corner) case of the RS carrying the root region dying before its edits are persisted (splitMetaLog doesn't take into account root and the root edits are not written to a separate file either). The reason is HBASE-3171 is removing the root table/region and so not much point in trying to address root here ..
          I have added an API getWAL(HRegionInfo) in RegionServerServices, and when the argument is null, the default/common WAL is returned (imagine a WAL for meta, and maybe some other regions, and a common WAL for everything else).
          I have broken up the ServerShutdownHandler.process into two parts and moved some code into MetaServerShutdownHandler.
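In outline, splitting the shutdown handling between a general handler and a meta-specific one might look something like the sketch below. The classes `ShutdownHandlerSketch` and `MetaShutdownHandlerSketch` are simplified, hypothetical stand-ins that only record the order of recovery steps; they are not the actual handler classes, and the step names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the general handler: recovers user-region edits.
class ShutdownHandlerSketch {
  protected final List<String> steps = new ArrayList<>();

  void process() {
    steps.add("split non-meta logs");
    steps.add("reassign user regions");
  }

  List<String> steps() {
    return steps;
  }
}

// Hypothetical stand-in for the meta handler: meta work first, then the rest.
class MetaShutdownHandlerSketch extends ShutdownHandlerSketch {
  @Override
  void process() {
    steps.add("split meta logs");
    steps.add("reassign .META.");
    super.process(); // continue with the general recovery
  }

  public static void main(String[] args) {
    ShutdownHandlerSketch handler = new MetaShutdownHandlerSketch();
    handler.process();
    System.out.println(handler.steps());
  }
}
```

The point of the arrangement is ordering: the catalog region is recovered before any user-region work begins, while the general path stays in one place.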

          stack added a comment -

          Do other logs have a '.log' suffix? You add .hlog with the below?

          + /** The META region's HLog filename extension */
          + public static final String META_HLOG_FILE_EXTN = ".meta.hlog";

          The refactoring in MasterFileSystem to make a new getLogDirs method is nice.

          The SSH changes look good. There is a hole though at the moment around ROOT handling? We need to wait on ROOT removal before this can go in, right? (The RootServerShutdownHandler would go....or should RSSH be calling MSSH when it is done w/ its process?)

          This below is given to an executor? Should have a better name than handler?

          + private UncaughtExceptionHandler handler;

          Will we always create the metahlog just because getMetaHLog was called? Even if we are NOT carrying .META.:

          + private HLog getMetaWAL() throws IOException {
          + if (this.hlogForMeta == null) {

          If .META. moves away from a RS and then comes back, we'll just use the already made meta log roller, etc.?

          Replication will skip this .META. log?

          Is it intentional that MetaServices is still in the patch?

          Changing param name from logName to logDir is good stuff

          In SplitLogManager, we have added splitMetaLogDistributed. A more generic method might have been splitLogDistributed that took a file path filter instead...

          There is a define for .META. HRegionInfo in HRegionInfo#FIRST_META_REGIONINFO so you don't have to make it each time, FYI.

          Don't have to deprecate the below I'd say. This is for 0.96 the singularity and this is on a class with annotation @InterfaceAudience.Private

          /** @return the HLog */
          + @Deprecated
          public HLog getWAL();
          +
          + /** @return the HLog for a particular region. Pass null for getting the
          + * default (common) WAL */
          + public HLog getWAL(HRegionInfo regionInfo) throws IOException;

          Here....

          -    return new Path(dir, prefix + "." + filenum);
          +    String child = prefix + "." + filenum;
          +    if (forMeta) {
          +      child += HConstants.META_HLOG_FILE_EXTN;
          +    }
          +    return new Path(dir, child);

          the 'normal' WALs do not seem to pick up the suffix in here. Is it possible that the '.log' is appended elsewhere? Are your .META. logs getting double '.log'? Just wondering.

          Good stuff

          Devaraj Das added a comment -

          Do other logs have a '.log' suffix? You add .hlog with the below?

          Currently (in the trunk codebase), the log files don't have extension like .log or something (the files look like 192.168.1.8%2C60020%2C1354695930505.1354695931168). They are in directory called ".logs/<region-server>". So I am wondering whether I should just have ".meta" as the suffix for the meta log files, and be done with it (as I had it in the previous patch). In the current patch, the meta logs have .meta.hlog as the extension. Or, would it make sense to change the log files to have ".hlog" extension as well? Please let me know..

          The SSH changes look good. There is a hole though at the moment around ROOT handling? We need to wait on ROOT removal before this can go in, right? (The RootServerShutdownHandler would go....or should RSSH be calling MSSH when it is done w/ its process?)

          It can be done either way in theory (HBASE-3171 first or this first). On the hole, yes, if this patch goes in first, and if there is a case where the same RS hosts both the root and meta regions, and that RS goes down, and the root's edits weren't already persisted, there will be a problem. But given that the root doesn't have edits other than the one row to do with .META., it is unlikely that we will run into the hole in practice. But I am okay to wait till HBASE-3171 goes in (that's a better way to line up the commits), though I would like to close on the other comments.

          This below is given to an executor? Should have a better name than handler?

          + private UncaughtExceptionHandler handler;

          Will take a look and update.

          Will we always create the metahlog just because getMetaHLog was called? Even if we are NOT carrying .META.:

          No. It'll be created on the first call to getMetaWAL (and getMetaWAL gets indirectly called only when we are asked to open the meta region in OpenRegionHandler)
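The behaviour described here (a common WAL returned when no region is given, and a meta WAL created only on the first request for the meta region) can be sketched as below. Everything in it is a hypothetical stand-in: `WalLookupSketch`, its nested `Region` and `Wal` types, and `metaWalCreated` are invented for illustration and are not the RegionServerServices API.

```java
// Hypothetical sketch of per-region WAL lookup with a lazily created meta WAL.
final class WalLookupSketch {
  /** Minimal stand-in for region info; only knows whether it is the meta region. */
  static final class Region {
    final String name;
    final boolean meta;

    Region(String name, boolean meta) {
      this.name = name;
      this.meta = meta;
    }
  }

  /** Minimal stand-in for a WAL; only carries a label. */
  static final class Wal {
    final String label;

    Wal(String label) {
      this.label = label;
    }
  }

  private final Wal defaultWal = new Wal("default");
  private Wal metaWal; // stays null until a meta region first asks for it

  /** A null region means "give me the default (common) WAL". */
  synchronized Wal getWal(Region region) {
    if (region == null || !region.meta) {
      return defaultWal;
    }
    if (metaWal == null) {
      metaWal = new Wal("meta");
    }
    return metaWal;
  }

  synchronized boolean metaWalCreated() {
    return metaWal != null;
  }
}
```

A server that never opens the meta region never pays for the second WAL, and a meta region that moves away and later returns gets the instance that was already created.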

          If .META. moves away from a RS and then comes back, we'll just use the already made meta log roller, etc.?

          Yes.

          Replication will skip this .META. log?

          Yes. The MetaLogRoller doesn't "register" with the Replication folks when instantiated.

          Is it intentional that MetaServices is still in the patch?

          Mistake.. Will update.

          In SplitLogManager, we have added splitMetaLogDistributed. A more generic method might have been splitLogDistributed that took a file path filter instead...

          Hmm.. Let me see..

          There is a define for .META. HRegionInfo in HRegionInfo#FIRST_META_REGIONINFO so you don't have to make it each time, FYI.

          I had seen this one but had forgotten to update code to use it.. will update.

          Don't have to deprecate the below I'd say. This is for 0.96 the singularity and this is on a class with annotation @InterfaceAudience.Private

          /** @return the HLog */

          + @Deprecated

          public HLog getWAL();

          Okay..

          Here....
          return new Path(dir, prefix + "." + filenum);
          + String child = prefix + "." + filenum;
          + return new Path(dir, child);
          the 'normal' WALs do not seem to pick up the suffix in here. Is it possible that the '.log' is appended elsewhere? Are your .META. logs getting double '.log'? Just wondering.

          No.. as I said before, the log files don't have the string extensions.. So what do you think - should I just remove the .hlog from the .meta.hlog extension that I did for meta hlogs? Or, leave .meta.hlog as it is and add .hlog to the non-meta hlog files, or, leave the non-meta hlog names as they are now?

          stack added a comment -

          So I am wondering whether I should just have ".meta" as the suffix for the meta log files, and be done with it (as I had it in the previous patch).

          Yes. The '.hlog' is extraneous. (I thought all log files had a .log suffix; sorry if I misled.)

          One more spin and I'd say this patch is ready to go in. Good on you DD.

          Devaraj Das added a comment -

          One more spin and I'd say this patch is ready to go in. Good on you DD.

          Sweet... Here is an updated patch with the filename extn change, and with passing PathFilter in splitLogDistributed as per your last comments.

          Devaraj Das added a comment -

          I ran tests locally with the patch and they passed. I also ran some manual tests (killing region servers after making sure some edits weren't persisted, etc.).

          stack added a comment -

          This looks new (and important):

          + splitLog(serverNames, META_FILTER);
          + splitLog(serverNames, NON_META_FILTER);

          I can remove this on commit:

          /** @return the HLog */
          public HLog getWAL();

          Patch looks good to me.

          Let's wait some time on HBASE-3171. If it doesn't show up soon, we'll commit this.
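
          For readers following along, here is a minimal sketch of the ordering the two splitLog calls above imply: files matching the meta filter are handled first, then everything else. It is not the patch's code. The class name, the string-based predicates standing in for META_FILTER and NON_META_FILTER, and the sample file names are all hypothetical, and it assumes a meta WAL is recognized purely by the ".meta" suffix agreed on earlier in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class MetaFirstSplitSketch {
    // Assumption: meta WAL files are identified only by this suffix.
    static final String META_SUFFIX = ".meta";

    // Hypothetical stand-ins for the patch's META_FILTER / NON_META_FILTER,
    // expressed over plain file names instead of Hadoop paths.
    static final Predicate<String> META_FILTER = name -> name.endsWith(META_SUFFIX);
    static final Predicate<String> NON_META_FILTER = META_FILTER.negate();

    // Returns the file names in the order they would be split: meta logs first,
    // then the remaining logs, each group keeping its original relative order.
    static List<String> splitOrder(List<String> logFiles) {
        List<String> order = new ArrayList<>();
        for (String f : logFiles) {
            if (META_FILTER.test(f)) order.add(f);
        }
        for (String f : logFiles) {
            if (NON_META_FILTER.test(f)) order.add(f);
        }
        return order;
    }

    public static void main(String[] args) {
        // Illustrative names only; real WAL names are longer.
        List<String> logs = Arrays.asList("rs1.1001", "rs1.1002.meta", "rs1.1003");
        System.out.println(splitOrder(logs));
    }
}
```

          Splitting in two passes, rather than sorting one list, mirrors the two separate calls in the patch: the meta pass can complete (and .META. can be reassigned) before the larger non-meta pass starts.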

          stack added a comment -

          Marking critical so we don't forget it, so it goes in.

          Devaraj Das added a comment -

          I can remove this on commit:

          Oops.. apologies for missing the removal. This patch removes the interface method.

          This looks new (and important):

          Yeah (smile). I noticed it recently..

          Let's wait some time on HBASE-3171. If it doesn't show up soon, we'll commit this.

          Okay. That makes sense. Thanks folks for reviewing the patch and providing good feedback.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12556217/7213-2.8.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          -1 javadoc. The javadoc tool appears to have generated 102 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 23 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.regionserver.TestHRegionOnCluster
          org.apache.hadoop.hbase.TestDrainingServer
          org.apache.hadoop.hbase.client.TestMultiParallel
          org.apache.hadoop.hbase.regionserver.TestRSKilledWhenMasterInitializing

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3511//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3511//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3511//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3511//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3511//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3511//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3511//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3511//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3511//console

          This message is automatically generated.

          Devaraj Das added a comment -

          Folks, should we commit this? It has been nearly 1.5 months since the patch was reviewed and +1'ed. Since this patch is somewhat big, I'd prefer that it be closed sooner rather than later. (Can I do anything to expedite the commit of this patch?)

          Ted Yu added a comment -

          There're a few conflicts for patch v8:

          -rw-r--r--  1 tyu  staff  1578 Jan  9 15:24 ./hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java.rej
          -rw-r--r--  1 tyu  staff  3086 Jan  9 15:24 ./hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java.rej
          -rw-r--r--  1 tyu  staff  1380 Jan  9 15:24 ./hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java.rej
          -rw-r--r--  1 tyu  staff  323 Jan  9 15:24 ./hbase-server/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java.rej
          

          After they're fixed, let Hadoop QA run the test suite.

          Devaraj Das added a comment -

          Thanks for looking, Ted Yu. This is the rebased patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12564079/7213-2.9.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.client.TestScannerTimeout
          org.apache.hadoop.hbase.replication.TestReplication
          org.apache.hadoop.hbase.master.TestMasterFailover
          org.apache.hadoop.hbase.regionserver.TestRSKilledWhenMasterInitializing
          org.apache.hadoop.hbase.regionserver.TestHRegionOnCluster
          org.apache.hadoop.hbase.TestDrainingServer
          org.apache.hadoop.hbase.TestLocalHBaseCluster

          -1 core zombie tests. There are 2 zombie test(s): at org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testSplitBeforeSettingSplittingInZKInternals(TestSplitTransactionOnCluster.java:738)
          at org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testSplitBeforeSettingSplittingInZK(TestSplitTransactionOnCluster.java:541)
          at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup.testBalancerWithRackLocality(TestBalancerWithNodeGroup.java:220)

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3956//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3956//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3956//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3956//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3956//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3956//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3956//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3956//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3956//console

          This message is automatically generated.

          Ted Yu added a comment -

          Running tests mentioned above against latest patch:

          test3686b(org.apache.hadoop.hbase.client.TestScannerTimeout)  Time elapsed: 0.007 sec  <<< ERROR!
          java.lang.ArrayIndexOutOfBoundsException: -1
            at java.util.concurrent.CopyOnWriteArrayList.get(CopyOnWriteArrayList.java:343)
            at java.util.Collections$UnmodifiableList.get(Collections.java:1152)
            at org.apache.hadoop.hbase.HBaseTestingUtility.getRSForFirstRegionInTable(HBaseTestingUtility.java:1456)
            at org.apache.hadoop.hbase.client.TestScannerTimeout.test3686b(TestScannerTimeout.java:199)
          

          It may not be related to the patch.
          The following two may need a closer look:

          testDataCorrectnessReplayingRecoveredEdits(org.apache.hadoop.hbase.regionserver.TestHRegionOnCluster)  Time elapsed: 85.445 sec  <<< ERROR!
          org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: ExecutionException: 1 time,
            at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process.processBatchCallback(HConnectionManager.java:2063)
            at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process.access$900(HConnectionManager.java:1850)
            at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1839)
            at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1818)
            at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:885)
            at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:695)
            at org.apache.hadoop.hbase.client.HTable.put(HTable.java:670)
            at org.apache.hadoop.hbase.regionserver.TestHRegionOnCluster.putDataAndVerify(TestHRegionOnCluster.java:127)
            at org.apache.hadoop.hbase.regionserver.TestHRegionOnCluster.testDataCorrectnessReplayingRecoveredEdits(TestHRegionOnCluster.java:115)
          
          testCorrectnessWhenMasterFailOver(org.apache.hadoop.hbase.regionserver.TestRSKilledWhenMasterInitializing)  Time elapsed: 23.843 sec  <<< FAILURE!
          java.lang.AssertionError
            at org.junit.Assert.fail(Assert.java:92)
            at org.junit.Assert.assertTrue(Assert.java:43)
            at org.junit.Assert.assertTrue(Assert.java:54)
            at org.apache.hadoop.hbase.regionserver.TestRSKilledWhenMasterInitializing.testCorrectnessWhenMasterFailOver(TestRSKilledWhenMasterInitializing.java:188)
          
          Devaraj Das added a comment -

          Rebased (again). Also I fixed a bug in HMaster.java. Some of the unit test failures were legit and were caused by the bug.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12564237/7213-2.10.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.client.TestScannerTimeout
          org.apache.hadoop.hbase.client.TestMultiParallel
          org.apache.hadoop.hbase.TestLocalHBaseCluster

          -1 core zombie tests. There are 3 zombie test(s): at org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testSplitBeforeSettingSplittingInZKInternals(TestSplitTransactionOnCluster.java:738)
          at org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testSplitBeforeSettingSplittingInZK(TestSplitTransactionOnCluster.java:541)
          at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup.testBalancerWithRackLocality(TestBalancerWithNodeGroup.java:220)

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3965//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3965//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3965//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3965//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3965//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3965//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3965//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3965//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3965//console

          This message is automatically generated.

          stack added a comment -

          I was trying this, Devaraj Das, and it fails to apply to trunk. I tried to get it in, but there is a big difference in this file: hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java.rej. Mind taking a look, sir?

          Looking at the patch again, you might move the define that is out in HConstants local to the wal package:

          public static final String META_HLOG_FILE_EXTN = ".meta";

          Put it in HLog.java Interface?

          In MasterFileSystem, should a few of the public methods have some javadoc or comment? For instance, I see splitMetaLog twice but one takes a ServerName and another takes a list of ServerNames but their bodies do different things. It is a little confusing.

          Else looks good. Let's get it in quick. Elsewhere, fellas are talking about removing the MetaServerShutdown handler, which would require yet another patch moving stuff around. Good stuff.
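
          To make the suggestion about the constant concrete, here is a small sketch of what keeping it on the WAL interface could look like, along with how a meta WAL file name might be derived from it. This is an illustration, not the patch: HLogSketch, walFileName and isMetaWal are hypothetical names, and the naming rule assumes the suffix is simply appended to the existing prefix + "." + filenum form discussed earlier in this thread.

```java
public class MetaWalNameSketch {
    // Hypothetical stand-in for the HLog interface; interface fields are
    // implicitly public static final, so the constant lives with the WAL code.
    interface HLogSketch {
        String META_HLOG_FILE_EXTN = ".meta";
    }

    // Builds a WAL file name; assumes meta WALs only differ by the appended suffix.
    static String walFileName(String prefix, long filenum, boolean forMeta) {
        String child = prefix + "." + filenum;
        return forMeta ? child + HLogSketch.META_HLOG_FILE_EXTN : child;
    }

    // True if the name carries the meta suffix.
    static boolean isMetaWal(String fileName) {
        return fileName.endsWith(HLogSketch.META_HLOG_FILE_EXTN);
    }

    public static void main(String[] args) {
        // Illustrative prefix and file number only.
        String normal = walFileName("rs1", 1001L, false);
        String meta = walFileName("rs1", 1001L, true);
        System.out.println(normal + " meta? " + isMetaWal(normal));
        System.out.println(meta + " meta? " + isMetaWal(meta));
    }
}
```

          Keeping the suffix next to the code that builds and recognizes file names means callers elsewhere never need to spell out ".meta" themselves.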

          Devaraj Das added a comment -

          Hopefully my last rebase

          Devaraj Das added a comment -

          Sorry, forgot to mention that I took into consideration the last 2-3 comments from stack in the last patch (and in the process, removed one unnecessary splitMetaLog).

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12564369/7213-2.11.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.regionserver.TestSplitTransaction

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3979//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3979//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3979//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3979//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3979//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3979//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3979//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3979//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3979//console

          This message is automatically generated.

          stack added a comment -

          Committed to trunk. Thanks for the patch DD.

          Hudson added a comment -

          Integrated in HBase-TRUNK #3730 (See https://builds.apache.org/job/HBase-TRUNK/3730/)
          HBASE-7213 Have HLog files for .META. edits only (Revision 1431935)

          Result = FAILURE
          stack :
          Files :

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/MetaServerShutdownHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/LogRoller.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetaLogRoller.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogFactory.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogUtil.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          Ted Yu added a comment -

          The TestSplitTransaction failure in the QA run prevented the large tests from running.

          I guess the failed tests (TestWALReplay, TestScannerTimeout) in trunk build #3730 were related to this patch.

          Devaraj Das added a comment -

          I will look at the test failures shortly.

          stack added a comment -

          Ted Yu Thanks Ted. Let me back out the patch till we get green light from DD.

          stack added a comment -

          Or, I'll leave it in for a few hours... if you get a chance to take a look and make an addendum, that'd be sweet, Devaraj Das.

          stack added a comment -

          Here is what I committed last night. I fixed a long line, IIRC. Putting this up in case we have to roll back.

          Ted Yu added a comment -

          TestCatalogTrackerOnCluster failed in trunk builds #3730 and #3731. It fails locally as well.
          I saw the following in test output:

          2013-01-11 10:42:33,443 ERROR [Shutdown of org.apache.hadoop.hbase.fs.HFileSystem@2dc4df0b] hdfs.DFSClient(416): Failed to close file /user/tyu/hbase/.logs/10.10.8.161,51511,1357929747103/10.10.8.161%2C51511%2C1357929747103.1357929752224.meta
          org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /user/tyu/hbase/.logs/10.10.8.161,51511,1357929747103/10.10.8.161%2C51511%2C1357929747103.1357929752224.meta File does not exist. [Lease.  Holder: DFSClient_hb_rs_10.10.8.161,51511,1357929747103_891897357_107, pendingcreates: 1]
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1720)
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1711)
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1619)
            at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:729)
            at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:601)
            at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
            at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
            at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:415)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
            at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)
          

          Will attach test output.
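
The file named in the error above ends in .meta, which matches the separate catalog WAL this issue introduces. As a rough illustration only, assuming the file-name suffix alone is what tells the two kinds of log apart, a check could look like the sketch below. The class name, method name, and constant are hypothetical and are not taken from the patch or from the HBase API.

```java
// Illustrative helper only; names here are hypothetical, not HBase API.
final class MetaLogNames {
    // Suffix seen on the catalog WAL file name in the log output above.
    static final String META_SUFFIX = ".meta";

    private MetaLogNames() {}

    /** Returns true when the last path component ends with the meta suffix. */
    static boolean isMetaLog(String path) {
        if (path == null) {
            return false;
        }
        // Look only at the file name, not at the directories leading to it.
        String name = path.substring(path.lastIndexOf('/') + 1);
        return name.endsWith(META_SUFFIX);
    }
}
```

A check of this kind is what would let recovery pick out the catalog log files first, as the release note describes, without opening them.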

          stack added a comment -

          I reverted for now.

          Devaraj Das added a comment -

          Yeah the revert makes sense. Things may have changed in the codebase in the last couple of weeks leading to some test failures. I'm investigating more deeply now.

          Hudson added a comment -

          Integrated in HBase-TRUNK #3733 (See https://builds.apache.org/job/HBase-TRUNK/3733/)
          HBASE-7213 Have HLog files for .META. edits only; REVERT (Revision 1432234)

          Result = FAILURE
          stack :
          Files :

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/MetaServerShutdownHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/LogRoller.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetaLogRoller.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogFactory.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogUtil.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          Hudson added a comment -

          Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #344 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/344/)
          HBASE-7213 Have HLog files for .META. edits only; REVERT (Revision 1432234)

          Result = FAILURE
          stack :
          Files :

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/MetaServerShutdownHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/LogRoller.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetaLogRoller.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogFactory.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogUtil.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          Devaraj Das added a comment -

          Okay, this patch fixes the issue discovered by the tests. Specifically, in the previous iteration, I missed the closeWAL for the meta hlog when a regionserver is shut down gracefully. I also needed to add some checks around deletion of the log directory in the FSHLog.closeAndDelete method. I manually verified that the failing tests in https://builds.apache.org/job/HBase-TRUNK/3732/testReport/ pass with this patch.

          This patch also fixes the 100+ line length problem.

          Devaraj Das added a comment -

          Let's try hudson.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12564534/7213-2.12.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.replication.TestReplicationWithCompression
          org.apache.hadoop.hbase.regionserver.wal.TestWALReplay
          org.apache.hadoop.hbase.TestLocalHBaseCluster

          -1 core zombie tests. There are 2 zombie test(s): at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup.testBalancerWithRackLocality(TestBalancerWithNodeGroup.java:220)

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3988//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3988//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3988//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3988//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3988//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3988//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3988//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3988//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3988//console

          This message is automatically generated.

          Ted Yu added a comment -
          +  private static void closeWAL(final boolean delete, HLog hlog) {
               try {
          -      if (this.hlog != null) {
          +      if (hlog != null) {
          

          The above code deals with the hlog parameter, which happens to shadow this.hlog.
          It would be better to distinguish the parameter more clearly from the member variable.

          Currently HRegionServer has knowledge about the WAL for .META. and normal WAL. We can revisit this when multi-WAL is to be supported.
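
To make the naming concern concrete, here is a small standalone Java sketch of a parameter hiding a field of the same name. It is not code from the patch: the class, the String-typed field, and both method names are invented for illustration.

```java
// Standalone illustration; not HBase code. All names are hypothetical.
final class ShadowingSketch {
    // Field standing in for the region server's main WAL reference.
    private final String hlog = "main-wal";

    // The parameter has the same name as the field, so a bare `hlog` here
    // refers to the parameter; the field is reachable only as `this.hlog`.
    String shadowed(String hlog) {
        return "param=" + hlog + ", field=" + this.hlog;
    }

    // A distinct parameter name removes the ambiguity for readers.
    String distinct(String walToClose) {
        return "param=" + walToClose + ", field=" + hlog;
    }
}
```

Both methods behave the same; the second simply makes it obvious at a glance which log is being operated on.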

          Devaraj Das added a comment -

          Changed the part where I close the meta hlog, to make it less intrusive. In the previous patch, it seemed like I was putting state in FSHLog.closeAndDelete where it needed to keep track of directory being empty before deleting. Didn't like it so much. There is an API 'close' in FSHLog, I am now calling that from HRegionServer (left in a comment there about the reason).
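
The approach described here can be sketched roughly as follows. This is a simplified model, not the patch itself: it assumes the meta log and the regular log live in the same server directory (as the path in the earlier error output suggests), so only the regular log's closeAndDelete is left to remove that directory. The Wal interface, RecordingWal, and closeWals are hypothetical stand-ins for the real HLog types and HRegionServer code.

```java
// Hypothetical stand-ins; these are not the HBase HLog/FSHLog types.
interface Wal {
    void close();           // stop writing, leave files in place
    void closeAndDelete();  // stop writing and remove the log directory
}

// Records which calls were made, so the ordering can be inspected.
final class RecordingWal implements Wal {
    private final String name;
    private final StringBuilder events;

    RecordingWal(String name, StringBuilder events) {
        this.name = name;
        this.events = events;
    }

    @Override
    public void close() {
        events.append(name).append(":close;");
    }

    @Override
    public void closeAndDelete() {
        events.append(name).append(":closeAndDelete;");
    }
}

final class WalShutdownSketch {
    private WalShutdownSketch() {}

    // Plain close for the meta WAL first; directory cleanup, if requested,
    // is left to the main WAL's closeAndDelete.
    static void closeWals(Wal metaWal, Wal mainWal, boolean delete) {
        if (metaWal != null) {
            metaWal.close();
        }
        if (mainWal != null) {
            if (delete) {
                mainWal.closeAndDelete();
            } else {
                mainWal.close();
            }
        }
    }
}
```

The point of the design is that no state about directory emptiness has to be tracked inside closeAndDelete; the caller decides which log gets the plain close.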

          I have also attached the diff file that does a diff between the earlier committed patch (7213-2.11.patch) and the current one, for quick reference on what changed.

          Devaraj Das added a comment -

          I have also attached the diff file that does a diff between the earlier committed patch (7213-2.11.patch) and the current one, for quick reference on what changed.

          Deleted the file since it might confuse hadoopqa (but yeah, the difference between my earlier patch, a version of which got committed, and this new one is only in HRegionServer.closeWAL, plus a line-length fix).

          Ted Yu added a comment -

          Patch v14 looks good.
          I ran the following tests locally and they passed:

            mt -Dtest=TestCatalogTrackerOnCluster#testBadOriginalRootLocation
            mt -Dtest=TestUpgradeFromHFileV1ToEncoding#testUpgrade
            mt -Dtest=TestRestartCluster
          

          There was one test failure which may not be related to the patch:

          testReplayEditsAfterRegionMovedWithMultiCF(org.apache.hadoop.hbase.regionserver.wal.TestWALReplay)  Time elapsed: 41.99 sec  <<< ERROR!
          org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
          Sat Jan 12 11:25:58 PST 2013, org.apache.hadoop.hbase.client.HTable$3@3c993730, org.apache.hadoop.hbase.RegionMovedException: Region moved to: hostname=10.120.104.184 port=56941.
          Sat Jan 12 11:25:59 PST 2013, org.apache.hadoop.hbase.client.HTable$3@3c993730, java.io.IOException: Call to /10.120.104.184:56941 failed on local exception: java.io.EOFException
          Sat Jan 12 11:26:00 PST 2013, org.apache.hadoop.hbase.client.HTable$3@3c993730, java.net.ConnectException: Connection refused
          Sat Jan 12 11:26:01 PST 2013, org.apache.hadoop.hbase.client.HTable$3@3c993730, org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is in the failed servers list: /10.120.104.184:56941
          Sat Jan 12 11:26:03 PST 2013, org.apache.hadoop.hbase.client.HTable$3@3c993730, java.net.ConnectException: Connection refused
          Sat Jan 12 11:26:05 PST 2013, org.apache.hadoop.hbase.client.HTable$3@3c993730, java.net.ConnectException: Connection refused
          Sat Jan 12 11:26:09 PST 2013, org.apache.hadoop.hbase.client.HTable$3@3c993730, java.net.ConnectException: Connection refused
          Sat Jan 12 11:26:13 PST 2013, org.apache.hadoop.hbase.client.HTable$3@3c993730, java.net.ConnectException: Connection refused
          Sat Jan 12 11:26:22 PST 2013, org.apache.hadoop.hbase.client.HTable$3@3c993730, java.net.ConnectException: Connection refused
          Sat Jan 12 11:26:38 PST 2013, org.apache.hadoop.hbase.client.HTable$3@3c993730, java.net.ConnectException: Connection refused
          
            at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:186)
            at org.apache.hadoop.hbase.client.HTable.get(HTable.java:561)
            at org.apache.hadoop.hbase.regionserver.wal.TestWALReplay.testReplayEditsAfterRegionMovedWithMultiCF(TestWALReplay.java:198)
          
          Ted Yu added a comment -
          +      if (this.hlogForMeta != null) {
          ...
          +        this.hlogForMeta.close();
          +      }
                 if (this.hlog != null) {
          

          For the first if statement above, should we check whether this.hlog is null ? If this.hlog is null, we should be calling this.hlogForMeta.closeAndDelete().

          Since there is a constraint on the order of closing multiple WALs, it would be better if this logic were handled by FSHLog; that can be done in another JIRA.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12564585/7213-2.14.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.client.TestMultiParallel
          org.apache.hadoop.hbase.replication.TestReplicationWithCompression
          org.apache.hadoop.hbase.master.TestDistributedLogSplitting
          org.apache.hadoop.hbase.TestLocalHBaseCluster

          -1 core zombie tests. There are 2 zombie test(s): at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup.testBalancerWithRackLocality(TestBalancerWithNodeGroup.java:220)

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3989//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3989//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3989//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3989//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3989//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3989//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3989//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3989//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3989//console

          This message is automatically generated.

          Devaraj Das added a comment -

          For the first if statement above, should we check whether this.hlog is null ? If this.hlog is null, we should be calling this.hlogForMeta.closeAndDelete().

          I don't think we need to worry about it. We will not have a case where hlogForMeta is non-null and hlog is null (hlog is initialized before hlogForMeta).
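
To make the two positions in this exchange concrete, here is a minimal, self-contained sketch of the guarded variant that was asked about. It is not code from the patch: `WalCloseOrderSketch`, the `Wal` interface, `RecordingWal`, and `closeWals` are all hypothetical stand-ins, and only the method names `close` and `closeAndDelete` come from the discussion. It also assumes that closing the meta log before the main log is the ordering constraint being referred to.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; these are not the HBase classes. */
class WalCloseOrderSketch {

    /** Hypothetical minimal WAL interface with the two calls named in the thread. */
    interface Wal {
        void close();
        void closeAndDelete();
    }

    /** Test double that records which call was made, in order. */
    static final class RecordingWal implements Wal {
        private final String name;
        private final List<String> calls;

        RecordingWal(String name, List<String> calls) {
            this.name = name;
            this.calls = calls;
        }

        @Override
        public void close() {
            calls.add(name + ".close");
        }

        @Override
        public void closeAndDelete() {
            calls.add(name + ".closeAndDelete");
        }
    }

    /**
     * Closes the meta WAL first, then the main WAL.
     * If the main WAL is absent, the meta WAL does the delete itself
     * (the extra guard suggested in the review comment).
     */
    static void closeWals(Wal hlog, Wal hlogForMeta) {
        if (hlogForMeta != null) {
            if (hlog != null) {
                hlogForMeta.close();
            } else {
                hlogForMeta.closeAndDelete();
            }
        }
        if (hlog != null) {
            hlog.closeAndDelete();
        }
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        closeWals(new RecordingWal("hlog", calls), new RecordingWal("hlogForMeta", calls));
        System.out.println(calls);
    }
}
```

If, as stated in the reply, hlog is always initialized before hlogForMeta, the `else` branch is never reached and the simpler unguarded form behaves identically.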

          Devaraj Das added a comment -

          Verified that the tests pass locally.

          Ted Yu added a comment -

          Re-attaching patch v14.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12564591/7213-2.14.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.client.TestScannerTimeout
          org.apache.hadoop.hbase.client.TestMultiParallel
          org.apache.hadoop.hbase.regionserver.TestHRegionOnCluster
          org.apache.hadoop.hbase.master.TestRollingRestart
          org.apache.hadoop.hbase.TestLocalHBaseCluster

          -1 core zombie tests. There are 4 zombie test(s): at org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testSplitBeforeSettingSplittingInZKInternals(TestSplitTransactionOnCluster.java:738)
          at org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testSplitBeforeSettingSplittingInZK(TestSplitTransactionOnCluster.java:541)
          at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup.testBalancerWithRackLocality(TestBalancerWithNodeGroup.java:220)

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3991//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3991//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3991//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3991//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3991//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3991//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3991//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3991//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3991//console

          This message is automatically generated.

          chunhui shen added a comment -

          I ran the failing test TestMultiParallel several times on my local PC.

          Without the patch, it passed all 15 runs.
          With 7213-2.14.patch, 4 of 10 runs failed.

          The stack traces from the failed tests follow:

          org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 37 actions: ExecutionException: 37 times,
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process$BatchErrors.makeException(HConnectionManager.java:2106)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process$BatchErrors.rethrowIfAny(HConnectionManager.java:2089)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process.processBatchCallback(HConnectionManager.java:2071)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process.access$900(HConnectionManager.java:1850)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1839)
                  at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:601)
                  at org.apache.hadoop.hbase.client.TestMultiParallel.testBatchWithPut(TestMultiParallel.java:303)
           
          org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 321 actions:
          ExecutionException: 321 times,
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process$BatchErrors.makeException(HConnectionManager.java:2106)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process$BatchErrors.rethrowIfAny(HConnectionManager.java:2089)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process.processBatchCallback(HConnectionManager.java:2071)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process.access$900(HConnectionManager.java:1850)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1839)
                  at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:601)
                  at org.apache.hadoop.hbase.client.TestMultiParallel.testBatchWithDelete(TestMultiParallel.java:331)
           
          org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 321 actions:
          ExecutionException: 321 times,
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process$BatchErrors.makeException(HConnectionManager.java:2106)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process$BatchErrors.rethrowIfAny(HConnectionManager.java:2089)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process.processBatchCallback(HConnectionManager.java:2071)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process.access$900(HConnectionManager.java:1850)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1839)
                  at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:601)
                  at org.apache.hadoop.hbase.client.TestMultiParallel.testHTableDeleteWithList(TestMultiParallel.java:360)
           
          org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 100 actions:
          ExecutionException: 100 times,
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process$BatchErrors.makeException(HConnectionManager.java:2106)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process$BatchErrors.rethrowIfAny(HConnectionManager.java:2089)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process.processBatchCallback(HConnectionManager.java:2071)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process.access$900(HConnectionManager.java:1850)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1839)
                  at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:601)
                  at org.apache.hadoop.hbase.client.TestMultiParallel.testBatchWithManyColsInOneRowGetAndPut(TestMultiParallel.java:394)
          
          org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 358 actions:
          ExecutionException: 358 times,
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process$BatchErrors.makeException(HConnectionManager.java:2106)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process$BatchErrors.rethrowIfAny(HConnectionManager.java:2089)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process.processBatchCallback(HConnectionManager.java:2071)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process.access$900(HConnectionManager.java:1850)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1839)
                  at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:601)
                  at org.apache.hadoop.hbase.client.TestMultiParallel.testBatchWithMixedActions(TestMultiParallel.java:459)
          

          FYI

          Thanks

          Devaraj Das added a comment -

          I am looking at TestMultiParallel. At this point I just can't see how this patch could cause the test failure.

          Ted Yu added a comment -

          Test output for TestMultiParallel failure.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12564621/TEST-org.apache.hadoop.hbase.client.TestMultiParallel.xml
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 5 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3993//console

          This message is automatically generated.

          chunhui shen added a comment -

          From

          public HLog getWAL(HRegionInfo regionInfo) throws IOException {
              //TODO: at some point this should delegate to the HLogFactory
              //currently, we don't care about the region as much as we care about the 
              //table.. (hence checking the tablename below)
              if (regionInfo != null && 
                  Arrays.equals(regionInfo.getTableName(), HConstants.META_TABLE_NAME)) {
                return getMetaWAL();
              }
          

          What about the ROOT region? Do we put its logs with those of the user regions?
          In MetaServerShutdownHandler, we treat both META and ROOT as meta regions.

          Correct me if I'm wrong.

          Thanks.
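
The point raised above can be illustrated with a small, self-contained sketch in which both catalog tables are routed to the meta WAL. This is an assumption about the shape of the fix, not the patch itself: `CatalogWalRoutingSketch`, the `Wal` enum, `isCatalogTable`, and `walFor` are hypothetical names, and the real method takes an HRegionInfo and returns an HLog rather than working on raw table-name bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative only; these are not the HBase classes. */
class CatalogWalRoutingSketch {

    // Table names as they appear in this issue; defined locally for the sketch.
    static final byte[] ROOT_TABLE_NAME = "-ROOT-".getBytes(StandardCharsets.UTF_8);
    static final byte[] META_TABLE_NAME = ".META.".getBytes(StandardCharsets.UTF_8);

    /** Hypothetical stand-in for "which log does this edit go to". */
    enum Wal { DEFAULT, META }

    /** True when the table is one of the two catalog tables. */
    static boolean isCatalogTable(byte[] tableName) {
        return Arrays.equals(tableName, META_TABLE_NAME)
            || Arrays.equals(tableName, ROOT_TABLE_NAME);
    }

    /**
     * Chooses the WAL by table name. A null name (no region info)
     * falls back to the default WAL, as in the quoted snippet.
     */
    static Wal walFor(byte[] tableName) {
        if (tableName != null && isCatalogTable(tableName)) {
            return Wal.META;
        }
        return Wal.DEFAULT;
    }

    public static void main(String[] args) {
        System.out.println(walFor(META_TABLE_NAME));
        System.out.println(walFor(ROOT_TABLE_NAME));
        System.out.println(walFor("usertable".getBytes(StandardCharsets.UTF_8)));
    }
}
```

Checking only the .META. name, as in the quoted snippet, would leave -ROOT- edits in the default WAL even though the shutdown handling treats both tables as meta regions.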

          chunhui shen added a comment -

          After fixing the above problem, TestMultiParallel passed 15 times on my local PC.

          Ted Yu added a comment -

          Patch v15 incorporates Chunhui's suggestion

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12564659/7213-15.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.TestZooKeeper

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3996//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3996//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3996//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3996//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3996//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3996//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3996//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3996//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3996//console

          This message is automatically generated.

          Ted Yu added a comment -

          I ran the following tests locally and they passed:

          mt -Dtest=TestZooKeeper,TestSplitTransactionOnCluster,TestHRegionOnCluster
          mt -Dtest=TestRSKilledWhenMasterInitializing

          I think patch v15 should be ready for integration.

          Devaraj Das added a comment -

          Thanks a lot, chunhui shen for spotting the problem. Thanks, Ted Yu for validating the fix.

          I am uploading a new patch that takes into account the latest change in HRegionServer.getWAL(HRegionInfo) as suggested by chunhui shen, and also adds/updates some comments in parts of the patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12564753/7213-2.16.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.TestLocalHBaseCluster

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4004//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4004//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4004//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4004//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4004//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4004//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4004//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4004//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4004//console

          This message is automatically generated.

          Devaraj Das added a comment -

          I ran TestLocalHBaseCluster many times locally and all runs passed.

          Ted Yu added a comment -

          Integrated v16 to trunk.

          Thanks for the patch, Devaraj.

          Thanks for the review, Chunhui.

          Hudson added a comment -

          Integrated in HBase-TRUNK #3744 (See https://builds.apache.org/job/HBase-TRUNK/3744/)
          HBASE-7213 Have HLog files for .META. and ROOT edits only (Devaraj Das) (Revision 1433152)

          Result = FAILURE
          tedyu :
          Files :

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/MetaServerShutdownHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/LogRoller.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetaLogRoller.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogFactory.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogUtil.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          Ted Yu added a comment -

          I ran TestAssignmentManagerOnCluster, TestDistributedLogSplitting and TestHBaseFsck using both jdk 1.6 and 1.7 locally.
          They passed.

          The tests failed in a previous build, #3743, too.

          Hudson added a comment -

          Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #347 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/347/)
          HBASE-7213 Have HLog files for .META. and ROOT edits only (Devaraj Das) (Revision 1433152)

          Result = FAILURE
          tedyu :
          Files :

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/MetaServerShutdownHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/LogRoller.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetaLogRoller.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogFactory.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogUtil.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          chunhui shen added a comment -

          Ted found the failed test in https://builds.apache.org/job/HBase-TRUNK/3751/testReport/org.apache.hadoop.hbase.regionserver/TestHRegionOnCluster/testDataCorrectnessReplayingRecoveredEdits/

          I think the root cause is :

          2013-01-15 18:55:11,132 INFO  [RegionServer:1;minerva.apache.org,57481,1358276108448] regionserver.HRegionServer(1658): STOPPED: Meta HLog roller thread is no longer alive -- stop
          

          From the code:

          this.metaHLogRoller = new MetaLogRoller(this, this);
          String n = Thread.currentThread().getName();
          Threads.setDaemonThreadRunning(this.metaHLogRoller.getThread(),
              n + "MetaLogRoller", uncaughtExceptionHandler);

          if (metaHLogRoller != null && !metaHLogRoller.isAlive()) {
            stop("Meta HLog roller thread is no longer alive -- stop");
            return false;
          }
          

          Shouldn't we consider the case where metaHLogRoller != null but the thread has not been started yet?

          I have made an addendum patch.

          Devaraj Das, could you confirm it?
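
          The window described above (the field is non-null, but the thread is not yet alive) can be sketched in isolation. The following is a minimal illustration of one way to close such a window, by starting the thread before publishing it to the shared field; the class, field, and method names are hypothetical, and this is not necessarily what the addendum patch does.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class RollerPublication {
    // Hypothetical stand-in for the region server's metaHLogRoller field.
    volatile Thread roller;

    /**
     * Starts the thread first and only then publishes it, so a concurrent
     * health check never observes a non-null roller that was never started.
     */
    void ensureRollerStarted(Runnable work) {
        Thread t = new Thread(work, "MetaLogRoller-sketch");
        t.setDaemon(true);
        t.start();
        this.roller = t; // publish only after start()
    }

    /** Same shape as the quoted check: unhealthy only if a published roller has died. */
    boolean isHealthy() {
        Thread t = roller;
        return t == null || t.isAlive();
    }

    public static void main(String[] args) {
        RollerPublication server = new RollerPublication();
        AtomicBoolean stop = new AtomicBoolean(false);
        System.out.println("before start: healthy=" + server.isHealthy());
        server.ensureRollerStarted(() -> {
            // Keep the thread alive until asked to stop.
            while (!stop.get()) {
                Thread.yield();
            }
        });
        System.out.println("after start: healthy=" + server.isHealthy());
        stop.set(true);
    }
}
```

          With this ordering, the health check sees either no roller at all or one that has already been started, regardless of which thread runs first.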

          Ted Yu added a comment -

          Try addendum through hadoop QA

          Ted Yu added a comment -

          The test does this:

                targetServer.kill();
          

          However, I can't find the "Simulated kill" message that should have been logged by the above call.

          chunhui shen added a comment -

          Ted,
          It is not caused by targetServer.kill();

          The test fails while the cluster is starting.

          The regionserver main thread performs the isHealthy() check in run()
          while the open-root thread calls getMetaWAL(). The interaction between these two threads caused the above log.

          Ted Yu added a comment -

          We're on the same page.
          I meant that region server going down was not caused by call to kill().

          chunhui shen added a comment - - edited

          Hmm... I misunderstood.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12565065/7213-addendum.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4041//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4041//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4041//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4041//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4041//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4041//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4041//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4041//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4041//console

          This message is automatically generated.

          Devaraj Das added a comment -

          Addendum looks fine. Thanks guys!

          Ted Yu added a comment -

          Addendum integrated to trunk.

          Thanks Chunhui for the patch.

          Hudson added a comment -

          Integrated in HBase-TRUNK #3754 (See https://builds.apache.org/job/HBase-TRUNK/3754/)
          HBASE-7213 Addendum tries to fix premature LogRoller exit (Chunhui) (Revision 1433830)

          Result = FAILURE
          tedyu :
          Files :

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          Hudson added a comment -

          Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #350 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/350/)
          HBASE-7213 Addendum tries to fix premature LogRoller exit (Chunhui) (Revision 1433830)

          Result = FAILURE
          tedyu :
          Files :

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          Hudson added a comment -

          Integrated in HBase-0.94 #924 (See https://builds.apache.org/job/HBase-0.94/924/)
          HBASE-8081. Backport HBASE-7213 (separate hlog for meta tables) (Devaraj Das). (Revision 1461314)

          Result = ABORTED
          ddas :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/MetaServerShutdownHandler.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/LogRoller.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/MetaLogRoller.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
          • /hbase/branches/0.94/src/main/resources/hbase-default.xml
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          Hudson added a comment -

          Integrated in HBase-0.94-security #129 (See https://builds.apache.org/job/HBase-0.94-security/129/)
          HBASE-8081. Backport HBASE-7213 (separate hlog for meta tables) (Devaraj Das). (Revision 1461314)

          Result = FAILURE
          ddas :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/MetaServerShutdownHandler.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/LogRoller.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/MetaLogRoller.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
          • /hbase/branches/0.94/src/main/resources/hbase-default.xml
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          Hudson added a comment -

          Integrated in HBase-0.94-security-on-Hadoop-23 #13 (See https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/13/)
          HBASE-8081. Backport HBASE-7213 (separate hlog for meta tables) (Devaraj Das). (Revision 1461314)

          Result = FAILURE
          ddas :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/MetaServerShutdownHandler.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/LogRoller.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/MetaLogRoller.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
          • /hbase/branches/0.94/src/main/resources/hbase-default.xml
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          stack added a comment -

          Marking closed.


            People

            • Assignee:
              Devaraj Das
              Reporter:
              Devaraj Das
            • Votes:
              0
              Watchers:
              14
