HBase — HBASE-7390

Add extra test cases for assignment on the region server and fix the related issues

    Details

    • Hadoop Flags:
      Reviewed

      Description

We don't have many tests on the region server itself.
Here are some.
Some of them are failing; feedback is welcome.
See also the attached state diagram for the ZK node states during assignment.

      1. 7390.v1.patch
        14 kB
        Nicolas Liochon
      2. assignment_zk_states.jpg
        99 kB
        Nicolas Liochon
      3. 7390.v2.patch
        25 kB
        Nicolas Liochon
      4. 7390.v3.patch
        31 kB
        Nicolas Liochon
      5. 7390.v4.patch
        32 kB
        Nicolas Liochon
      6. 7390.v6.patch
        49 kB
        Nicolas Liochon
      7. 7390.v7.patch
        49 kB
        Nicolas Liochon
      8. 7390.v8.patch
        62 kB
        Nicolas Liochon
      9. 7390.v9.patch
        62 kB
        Nicolas Liochon
      10. 7390.v9.patch
        62 kB
        Nicolas Liochon
      11. 7390.v10.patch
        65 kB
        Nicolas Liochon
      12. 7390.v11.patch
        70 kB
        Nicolas Liochon

        Issue Links

          Activity

stack added a comment -

          Why does master not see SPLITTING? Because it does not have a watcher? It cannot create a znode when there is a SPLITTING znode in place though. That is what we want, right? Ditto for the OPENING? Does the master not have a watcher on the OPENING znode?

          Missing license.

          Should test be called TestRegionServerNoMaster?

Odd that testCloseByMasterWithoutZNode fails. I can think of no reason why it should fail. Ditto testMultipleOpen

          Nice tests Nicolas Liochon

Nicolas Liochon added a comment -

This code here is very, very "smart": if there is an open in progress, it changes the internal state to close, then raises an exception through the call to checkIfRegionInTransition. Since we changed the state, if we were currently opening, we will get a message saying that we were trying to close a region that is already closing.

Also, does it mean that the other work in progress is supposed to catch the state change?

  public CloseRegionResponse closeRegion(final RpcController controller,
      final CloseRegionRequest request) throws ServiceException {
    // ... (the enclosing try block and the encodedName setup are elided here)
      Boolean openAction = regionsInTransitionInRS.get(encodedName);
      if (openAction != null) {
        if (openAction.booleanValue()) {
          regionsInTransitionInRS.replace(encodedName, openAction, Boolean.FALSE);
        }
        checkIfRegionInTransition(encodedName, CLOSE);
      }

      requestCount.increment();
      LOG.info("Received close region: " + region.getRegionNameAsString() +
        ". Version of ZK closing node:" + versionOfClosingNode +
        ". Destination server:" + sn);
      HRegionInfo regionInfo = region.getRegionInfo();
      checkIfRegionInTransition(encodedName, CLOSE);
      boolean closed = closeRegion(
        regionInfo, false, zk, versionOfClosingNode, sn);
      CloseRegionResponse.Builder builder =
        CloseRegionResponse.newBuilder().setClosed(closed);
      return builder.build();
    } catch (IOException ie) {
      throw new ServiceException(ie);
    }
  }
Nicolas Liochon added a comment -

          Why does master not see SPLITTING? Because it does not have a watcher? It cannot create a znode when there is a SPLITTING znode in place though. That is what we want, right? Ditto for the OPENING? Does the master not have a watcher on the OPENING znode?

I need to explore the splitting case, but there is no real reason imho (at least not a ZK reason).

          testMultipleOpen

I'm on this one (see comment above); it's not easy. The fix is trivial. What it is revealing is not.

Nicolas Liochon added a comment -

v2 fixes the issue. I've just removed the code above; I don't see how it could possibly work: if we were previously opening the region, we update the RIT state to closing, but without stopping the open or starting a close handler.

Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12561764/7390.v2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 28 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor

          -1 core zombie tests. There are zombie tests. See build logs for details.

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3617//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3617//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3617//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3617//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3617//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3617//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3617//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3617//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3617//console

          This message is automatically generated.

Nicolas Liochon added a comment -

Seems to be ok imho.

          For testCloseByMasterWithoutZNode, it fails because we update the znode state after doing the work. So when we discover there is no znode, the close is done.

It's easy to fix by adding a check at the beginning of the close.
As a consequence:

• the broken situation where we were calling a close without a znode will now lead to an error
• we will be a little slower, as we add another ZK sync & read on the critical path

If we're in line with what was done by Jimmy Xiang previously (i.e. checking carefully what we're doing), we should do it.

Thoughts?
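[Editor's note] The check-first idea can be sketched with a plain-Java stand-in for ZK; all class, field, and method names below are illustrative, not the actual HBase API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: verify the closing znode *before* doing the close
// work, so a close requested without a znode errors out immediately
// (at the cost of one extra ZK sync & read on the critical path).
class CloseCheckSketch {
  // Stand-in for ZK: encoded region name -> version of its closing znode.
  final Map<String, Integer> closingZnodes = new HashMap<>();
  boolean closeWorkDone = false;

  boolean closeRegion(String encodedName, int expectedVersion) {
    Integer version = closingZnodes.get(encodedName); // the new up-front check
    if (version == null || version.intValue() != expectedVersion) {
      return false; // fail fast: no (or stale) znode, no close work performed
    }
    closeWorkDone = true; // ...actual close work would happen here...
    return true;
  }
}
```

With the check at the top, the broken caller gets an error before any region state is touched, instead of after the close has already completed.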

Nicolas Liochon added a comment -

v3 with everything included. Ready for review.

Sergey Shelukhin added a comment -
  protected boolean closeRegion(HRegionInfo region, final boolean abort,
      final boolean zk) {
    try {
      return closeRegion(region, abort, zk, -1, null);
    } catch (RegionAlreadyInTransitionException e) {
      return false; // The region was already in transition for opening
    }
  }

          One closeRegion throws, and the other returns false, to indicate the same condition. Is this intended?

  } catch (RuntimeException t) {

          What is the RuntimeException here?

             * @param expectedVersion expecpted version og the znode
          

          "of", "expected"

Nicolas Liochon added a comment -

          @Stack
For SPLITTING and OPENING, I misread your point in your first comment. Yes, we may miss the state change because of the ZooKeeper API: when we receive a watch event, we need to do a getData, and in the meantime the value may have changed. The scenario is, for example:

• Master creates a znode with M_ZK_REGION_OFFLINE and watches it
          • Regionserver updates the znode to RS_ZK_REGION_OPENING
          • Master receives the event
          • Regionserver updates the znode to RS_ZK_REGION_OPENED
          • Master does a getData, and reads RS_ZK_REGION_OPENED

          In this scenario, the master missed the RS_ZK_REGION_OPENING. So in the master code you must be ready to miss some transitions. The trunk code is actually compatible with this, but the reason is not documented.

          However, I don't know why we tickle RS_REGION_SPLIT as we do in the code.
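[Editor's note] The missed-transition race described above can be simulated without ZooKeeper: a ZK watch event says only *that* the znode changed, not what it changed to, so the master must re-read; if two updates land before that read, the intermediate state is never observed. Names below are illustrative only:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal simulation of the watch/getData race: the master records whatever
// value is current at getData time, so an intermediate value set and then
// overwritten before the read (OPENING here) is silently skipped.
class WatchRaceSketch {
  private String data;
  private final List<String> seenByMaster = new ArrayList<>();

  WatchRaceSketch(String initial) { data = initial; }

  // Region-server side: update the znode data (would fire a one-shot watch).
  void rsSetData(String newData) { data = newData; }

  // Master side: the watch fired, now do the getData and record the value.
  void masterGetDataAfterWatch() { seenByMaster.add(data); }

  List<String> seenByMaster() { return seenByMaster; }
}
```

Running the scenario from the list above, the master's single getData observes only RS_ZK_REGION_OPENED; RS_ZK_REGION_OPENING is lost, which is why the master code must tolerate missed transitions.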

stack added a comment -

          Nicolas Liochon I follow you now. Yes, we can miss events and yes the master deals (and yes no harm in improving doc).

          IIRC (I believe I added SPLIT), we tickle in case the transition to SPLIT from SPLITTING is missed by the master. We keep tickling till master acknowledges the SPLIT message (by removing SPLIT znode – haven't looked at code).

Nicolas Liochon added a comment -

Yes, that's what we do. But we should never miss the final event, so it's a little strange. But on master failover, when we iterate over the znodes, we ignore the ones in the SPLIT state; that may be the reason. I need to look at this as well; I tend to think we may have issues around this part (master crash during a split).

Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12561787/7390.v3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 28 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.client.TestMultiParallel
          org.apache.hadoop.hbase.replication.TestReplication
          org.apache.hadoop.hbase.TestDrainingServer

          -1 core zombie tests. There are zombie tests. See build logs for details.

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3620//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3620//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3620//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3620//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3620//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3620//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3620//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3620//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3620//console

          This message is automatically generated.

Nicolas Liochon added a comment (edited) -

For the tests, we have the usual suspects again, it seems.

          Sergey Shelukhin

          One closeRegion throws, and the other returns false, to indicate the same condition. Is this intended?

Yeah... Questionable but acceptable imho. I don't mind changing this, but then I would need to throw an exception from the return somewhere else, and to change the return type from boolean to a three-state type (no state, same state, different state).
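[Editor's note] A hypothetical sketch of such a three-state return type; the enum and method names are illustrative, not from the patch:

```java
// Instead of mixing a boolean return with an exception, the transition
// check could return one of three explicit states. In the RIT map,
// Boolean.TRUE means "opening", Boolean.FALSE means "closing", and a
// null lookup means the region is not in transition at all.
enum TransitionCheck { NO_TRANSITION, SAME_TRANSITION, DIFFERENT_TRANSITION }

class RitCheckSketch {
  static TransitionCheck check(Boolean currentOpenAction, boolean requestedOpen) {
    if (currentOpenAction == null) {
      return TransitionCheck.NO_TRANSITION;
    }
    return currentOpenAction.booleanValue() == requestedOpen
        ? TransitionCheck.SAME_TRANSITION
        : TransitionCheck.DIFFERENT_TRANSITION;
  }
}
```

A caller could then switch on the result and decide locally whether to return, retry, or raise, rather than inferring the condition from a false/exception split across two overloads.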

          What is the RuntimeException here?

          Over (and bad) design I would say. Changed.

          "of", "expected"

          Done.

stack added a comment -

But on master failover, when we iterate over the znodes, we ignore the ones in the SPLIT state; that may be the reason.

          Yes. In case the master went away on us, so the new master gets the SPLIT callback...and its handler does the SPLIT cleanup.

The comment doesn't say enough:

             * Transitions an existing node for the specified region which is
             * currently in the SPLITTING state to be in the SPLIT state.  Converts the
             * ephemeral SPLITTING znode to an ephemeral SPLIT node.  Master cleans up
             * SPLIT znode when it reads it (or if we crash, zk will clean it up).
          
Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12561800/7390.v4.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 28 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:

          -1 core zombie tests. There are zombie tests. See build logs for details.

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3621//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3621//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3621//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3621//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3621//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3621//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3621//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3621//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3621//console

          This message is automatically generated.

Nicolas Liochon added a comment -

          Stack

          Yes. In case the master went away on us, so the new master gets the SPLIT callback...and its handler does the SPLIT clean-up.

          What's strange to me is that it seems similar to a RS_ZK_REGION_CLOSED, and in this case we don't retickle at all: we start the close handler immediately.

          For this patch, I am waiting for a +1 before committing, as it touches a critical part.

Sergey Shelukhin added a comment -

          +1 on v4 patch

stack added a comment -

          What's strange to me is that it seems similar to a RS_ZK_REGION_CLOSED, and in this case we don't retickle at all: we start the close handler immediately.

Agree we should do one or the other. The tickle does not seem necessary if, when a new master joins the cluster and sees a SPLIT, it does the cleanup immediately – queues the split handler as is done for CLOSED.

          Hide
          stack stack added a comment -

          On the patch, what Sergey said about different closeRegion returns. Could the one that throws an exception be private?

          Would putIfAbsent be easier to read if it took an enum rather than a boolean to say whether open or close? For example, what was there previously was opaque enough, and I appreciate your collecting the tests together into a single method, but the below could be better:

          - if (this.regionsInTransitionInRS.containsKey(region.getEncodedNameAsBytes())) {
          -   LOG.warn("Received close for region we are already opening or closing; " +
          -     region.getEncodedName());
          + if (putIfAbsent(region, false)) {
          +   // We're already closing this region.
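
          A minimal sketch of the enum-based alternative suggested here. The class, enum, and method names (`RitSketch`, `RitState`, `tryStartTransition`) are hypothetical, not HBase API — this only illustrates how an enum makes the intent of the map readable where a bare Boolean does not:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: tracking region-in-transition state with an enum
// instead of an opaque Boolean (true = opening, false = closing).
public class RitSketch {
  public enum RitState { OPENING, CLOSING }

  private final ConcurrentMap<String, RitState> regionsInTransition =
      new ConcurrentHashMap<>();

  /**
   * Returns true if the transition was registered, false if the region
   * is already in transition (the caller should reject the request).
   */
  public boolean tryStartTransition(String encodedRegionName, RitState state) {
    return regionsInTransition.putIfAbsent(encodedRegionName, state) == null;
  }

  public RitState stateOf(String encodedRegionName) {
    return regionsInTransition.get(encodedRegionName);
  }

  public static void main(String[] args) {
    RitSketch rit = new RitSketch();
    System.out.println(rit.tryStartTransition("abc123", RitState.OPENING)); // true
    // A second request for the same region is rejected, whatever its kind:
    System.out.println(rit.tryStartTransition("abc123", RitState.CLOSING)); // false
    System.out.println(rit.stateOf("abc123")); // OPENING
  }
}
```

          With the enum, `putIfAbsent(region, false)` no longer needs a comment to explain which transition it registers.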

          This below continue is right?

          + builder.addOpeningState(RegionOpeningState.OPENED);
          + continue;

          I see you do it again later if we continue... so above looks like it makes sense.

          Review comments: there are some errors, e.g. "+ * @param expectedVersion expecpted version og the znode". It looks like Klingon.

          I appreciate these changes:

          - String regionName)
          + String encodedRegionName)

          I always have to back up to find what is needed when making these calls. I suppose we should make an encodedRegionName type one of these days.

          Do we need to do this?

          + zkw.sync(encoded);

          It's to be 'sure'... It's expensive though...

          Nice test.

          Can you say more on this @nkeywal "This code there is very very smart: if there is a open in progress, it changes the internal state to close, then raises an exception through the call to checkIfRegionInTransition. As we changed the state, we will have, if we were currently opening, a message saying that we were trying to close a region already closing."

          I'm not sure I follow.

          The patch looks great, cleaning up fuzzy state.

          Let me get Jimmy to take a looksee too. He LOVEs this stuff. Jimmy Xiang Any chance of your taking a look here boss?

          nkeywal Nicolas Liochon added a comment -

          I'm not sure I follow.

          Boolean openAction = regionsInTransitionInRS.get(encodedName);
          if (openAction != null) {          // If this region is already in transition
            if (openAction.booleanValue()) { // and is opening
              regionsInTransitionInRS.replace(encodedName, openAction, Boolean.FALSE);
              // say it's now closing
            }
            checkIfRegionInTransition(encodedName, CLOSE); // throws an exception
            // We will never get here: we always throw an exception at the previous line.
            // So if we were opening, we now have an OpenRegionHandler running but the RIT state says closing.
            // What's going to happen to the znode now?
          }
          // If we get here, it means there is nothing in RIT
          requestCount.increment();
          LOG.info("Received close region: " + region.getRegionNameAsString() +
            ". Version of ZK closing node:" + versionOfClosingNode +
            ". Destination server:" + sn);
          HRegionInfo regionInfo = region.getRegionInfo();
          checkIfRegionInTransition(encodedName, CLOSE); // We've just checked 0.0001 ms ago!

          I looked at it more; it's actually by design: the OpenRegionHandler checks the content of this variable. So the behavior is: 'if we receive a request to close a region we're currently opening, stop the opening but raise an exception saying we can't close because we were already closing'.
          Issues are:

          • What's going to happen with the znode? It seems it won't be changed in the open handler, for example. So it will remain forever.
          • It's a little bit racy: if we were at the end of the opening, we can get the exception but the opening will be done anyway.
          • Worse, the closeRegion call will be rejected at its very beginning, because when we try to do the "getRegionByEncodedName", it throws a "RegionIsNotOnline".
            => In many cases, the opening will continue, as a region that is opening should not be in the onlineRegions list.
            => It could have an impact on the coprocessor implementation.
            => As a side note, as the closeRegion calls are nested, the coprocessors' "preClose" will be called multiple times.
            => As another side note, the way the other closeRegion is coded, we don't call the coprocessors when the region is not online.
            => So it means that we don't check the access rights if the region is not online, which allows cancelling an opening without the AR.
          • As well, we had no test case (that's why my patch, while broken, worked).

          I'm trying to solve this in the last version.
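
          The 'flip the flag, let the open handler notice' mechanism discussed above can be sketched in isolation with `ConcurrentMap.replace`. The class and method names (`RitFlipSketch`, `cancelOpen`, `openStillWanted`) are illustrative, not the actual RegionServer code:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative sketch of the check-then-act pattern: a close request flips
// the RIT flag from TRUE (opening) to FALSE (closing); the open handler is
// expected to check the flag before finalizing, and abort if it changed.
public class RitFlipSketch {
  static final ConcurrentMap<String, Boolean> rit = new ConcurrentHashMap<>();

  /**
   * The close path: flip opening -> closing with a compare-and-swap.
   * Returns true if we actually cancelled an in-flight open.
   */
  static boolean cancelOpen(String region) {
    return rit.replace(region, Boolean.TRUE, Boolean.FALSE);
  }

  /**
   * What the open handler must do near the end: only finish if the flag is
   * still TRUE. This is the racy window pointed at above: if the open
   * completes before the flip, the close request is silently lost.
   */
  static boolean openStillWanted(String region) {
    return Boolean.TRUE.equals(rit.get(region));
  }

  public static void main(String[] args) {
    rit.put("abc123", Boolean.TRUE);               // an open is in flight
    System.out.println(cancelOpen("abc123"));      // true: flipped to closing
    System.out.println(openStillWanted("abc123")); // false: handler should abort
    System.out.println(cancelOpen("abc123"));      // false: CAS fails, already FALSE
  }
}
```

          Note the CAS itself is fine; what the analysis above points out is everything around it — the znode left behind, the misleading exception, and the region-not-online rejection.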

          + zkw.sync(encoded);

          It's not absolutely necessary, but it maximizes the probability of correctness. It's done in all other cases as well, so not doing it here would be bad imho, as it would break consistency.

          Review comments. Some error. e.g. + * @param expectedVersion expecpted version og the znode It looks like klingon.

          Done.

          This below continue is right?

          I've changed the implementation to remove the continue. I found another issue while doing so. Previously, when the region was already in RIT and closing, we were answering "OPENED" on a bulkAssign (that is: the request was an open, and we were saying 'opened' while closing the region). This seems wrong. We now say "FAILED_OPENING" in this case.
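
          A hedged sketch of that bulk-assign decision. The `RegionOpeningState` values mirror the enum named in the patch, but the class, the `stateFor` helper, and its wiring are hypothetical, for illustration only:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: on a bulk open request, a region that is already in
// transition and closing must be reported as FAILED_OPENING, not OPENED.
public class BulkAssignSketch {
  enum RegionOpeningState { OPENED, ALREADY_OPENED, FAILED_OPENING }

  // true = opening, false = closing (mirrors the RIT map in the discussion)
  static final Map<String, Boolean> rit = new ConcurrentHashMap<>();

  static RegionOpeningState stateFor(String region, boolean alreadyOnline) {
    Boolean inTransition = rit.get(region);
    if (inTransition != null && !inTransition) {
      // In RIT and closing: the open cannot proceed.
      return RegionOpeningState.FAILED_OPENING;
    }
    if (alreadyOnline) {
      return RegionOpeningState.ALREADY_OPENED;
    }
    return RegionOpeningState.OPENED;
  }

  public static void main(String[] args) {
    rit.put("r1", Boolean.FALSE); // r1 is being closed
    System.out.println(stateFor("r1", false)); // FAILED_OPENING
    System.out.println(stateFor("r2", false)); // OPENED
    System.out.println(stateFor("r3", true));  // ALREADY_OPENED
  }
}
```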

          On the patch, what Sergey said about different closeRegion returns. Could the one that throws an exception be private?

          I looked at it in more detail. Actually, what this function returns is never checked. So I've renamed it and cleaned up the code around it.

          Would putIfAbsent be easier to read if it took an enum rather than a boolean to say whether open or close

          It was to be compatible with the RIT design. I removed the method. I tried to change RIT, but its type is public through the getter, so I prefer not doing that in this patch.

          I will publish an updated version soon.

          stack stack added a comment -

          Nice analysis @nkeywal. I am trying to follow but cannot (I think that is your point – smile). When you say '// If we're there, it means there is nothing in RIT ' – is the RIT in zk? Under 'unassigned' dir? Can you see who is responsible for the logic w/ git/svn blame? Maybe their insight would help w/ the unravelling? I like your list of problems w/ current implementation.

          nkeywal Nicolas Liochon added a comment -

          @Stack
          It's quite complicated. I think I'm seeing the end, or at least a checkpoint. It's the region server RIT: 3 lines above, we already checked the content, so this line is likely useless.

          nkeywal Nicolas Liochon added a comment -

          v6. Likely not final, but enough for testing and global feedback.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12562109/7390.v6.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3655//console

          This message is automatically generated.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12562114/7390.v7.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 21 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.TestDrainingServer

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3656//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3656//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3656//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3656//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3656//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3656//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3656//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3656//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3656//console

          This message is automatically generated.

          nkeywal Nicolas Liochon added a comment -

          v7. More bug fixes & more comments. I'm gonna put it on the review board.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12562325/7390.v8.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 5 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          -1 javadoc. The javadoc tool appears to have generated 2 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          -1 core zombie tests. There are 2 zombie test(s):

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3684//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3684//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3684//console

          This message is automatically generated.

          nkeywal Nicolas Liochon added a comment -

          It's on the review board; there is a +1 from Nick there.
          Is there anyone else reviewing it? If not, I will commit next Monday.

          jxiang Jimmy Xiang added a comment -

          Here is the link to the review: https://reviews.apache.org/r/8754/

          I will take a look today.

          nkeywal Nicolas Liochon added a comment -

          Thanks Nick & Jimmy.
          v9 contains the fixes for your points. I'm testing it here, and if it's ok I will commit it.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12562600/7390.v9.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 5 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          -1 javadoc. The javadoc tool appears to have generated 2 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler

          -1 core zombie tests. There are 2 zombie test(s):

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3738//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3738//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3738//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3738//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3738//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3738//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3738//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3738//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3738//console

          This message is automatically generated.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12562721/7390.v10.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 11 new or modified tests.

          +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces lines longer than 100

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.client.TestFromClientSide

          -1 core zombie tests. There are 2 zombie test(s):

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3766//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3766//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3766//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3766//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3766//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3766//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3766//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3766//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3766//console

          This message is automatically generated.

          stack stack added a comment -

          Patch is great. You have two +1s up in RB. Here is another +1 from me... The tests are great as are the comments.

          You have lines > 100 characters according to the hadoopqa output!

          This is no longer deprecated?

          * @deprecated use {@link #getRegionLocation(byte [], boolean)} instead

          This should not be javadoc:

          + *
          + * Protected to ease testing.

          Could be a comment inside the method.

          Does it have to be this explicit type:

          + public ConcurrentSkipListMap<byte[], Boolean> getRegionsInTransitionInRS();

          Can it be NavigableMap or SortedMap? (Not important).

          nkeywal Nicolas Liochon added a comment -

          You have lines > 100 characters according to the hadoopqa output!

          Hadoopqa is right. I will fix it on commit.

          This is no longer deprecated?

          It was a little bit strange to have this one deprecated while the ones with two parameters / same method name were not. So I thought it was better to de-deprecate it.

          This should not be javadoc:

          Hum, I thought it made sense to have it in the javadoc: someone using (vs. developing) HBase would need to know that if a method is not private, it's not because it's supposed to be used or extended, but only for internal testing.

          public ConcurrentSkipListMap<byte[], Boolean> getRegionsInTransitionInRS();

          I kept the change even if iirc I finally don't need it for this patch. What was useful was the purely concurrency-oriented operations, like putIfAbsent or the conditional remove. So the best interface would be ConcurrentMap. I will change this on commit. It's beyond the scope of this patch, but the RIT should be more encapsulated, especially now that we sometimes hijack it to cancel an open... This makes me think as well that the best implementation class would be ConcurrentHashMap instead of ConcurrentSkipListMap. But that will be for another patch as well.

          Thanks for the review, Stack. I will commit this tomorrow (and start to move forward with the other jiras I had on assignment: they depend on this one).
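
          The atomic operations mentioned above can be sketched against the ConcurrentMap interface. This is a hypothetical, simplified illustration, not HBase's actual RIT code: class and method names are made up, and String keys stand in for the byte[] encoded region names (raw byte[] keys would need a ConcurrentSkipListMap with a byte comparator, since ConcurrentHashMap hashes arrays by identity).

          ```java
          import java.util.concurrent.ConcurrentHashMap;
          import java.util.concurrent.ConcurrentMap;

          // Sketch of a region-server-side regions-in-transition map, declared
          // against ConcurrentMap so callers only rely on the atomic operations.
          public class RegionsInTransitionSketch {
            // Boolean value marks the transition direction: true = opening, false = closing.
            private final ConcurrentMap<String, Boolean> regionsInTransition =
                new ConcurrentHashMap<>();

            /** Atomically claim an open; returns false if the region is already in transition. */
            public boolean tryStartOpen(String encodedRegionName) {
              return regionsInTransition.putIfAbsent(encodedRegionName, Boolean.TRUE) == null;
            }

            /** Cancel an open, but only if the region is still marked as opening. */
            public boolean cancelOpen(String encodedRegionName) {
              return regionsInTransition.remove(encodedRegionName, Boolean.TRUE);
            }

            public static void main(String[] args) {
              RegionsInTransitionSketch rit = new RegionsInTransitionSketch();
              System.out.println(rit.tryStartOpen("region-a")); // first claim wins
              System.out.println(rit.tryStartOpen("region-a")); // duplicate open is rejected
              System.out.println(rit.cancelOpen("region-a"));   // cancel succeeds while opening
            }
          }
          ```

          Declaring the field as ConcurrentMap rather than the concrete ConcurrentSkipListMap keeps the sorted-map surface out of the contract, which is what makes a later swap to ConcurrentHashMap a non-breaking change.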

          stack stack added a comment -

          Above sounds reasonable to me.

          Regards encapsulating RIT, Jimmy did this for the server-side nicely.... makes sense that RS could get same treatment.

          nkeywal Nicolas Liochon added a comment -

          v11 is what I committed.

          hudson Hudson added a comment -

          Integrated in HBase-TRUNK #3676 (See https://builds.apache.org/job/HBase-TRUNK/3676/)
          HBASE-7390 Add extra test cases for assignement on the region server and fix the related issues (Revision 1427064)

          Result = FAILURE
          nkeywal :
          Files :

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HTable.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerNoMaster.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestOpenRegionHandler.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          hudson Hudson added a comment -

          Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #321 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/321/)
          HBASE-7390 Add extra test cases for assignement on the region server and fix the related issues (Revision 1427064)

          Result = FAILURE
          nkeywal :
          Files :

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HTable.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerNoMaster.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestOpenRegionHandler.java
          • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java
          stack stack added a comment -

          Marking closed.


            People

            • Assignee:
              nkeywal Nicolas Liochon
              Reporter:
              nkeywal Nicolas Liochon
            • Votes: 0
              Watchers: 6