Details
-
Bug
-
Status: Closed
-
Critical
-
Resolution: Fixed
-
0.90.7, 0.92.1, 0.94.0, 0.95.2
-
None
-
Reviewed
-
Adds a sequence id that is incremented on each Scanner#next invocation so we can distingush calls that arrive out of order. Throws OutOfOrderScannerNextException when we encounter the latter case.
Description
I'm seeing the following behavior:
- set RPC timeout to a short value
- call next() for some batch of rows, big enough so the client times out before the result is returned
- the HConnectionManager stuff will retry the next() call to the same server. At this point, one of two things can happen: 1) the previous next() call will still be processing, in which case you get a LeaseException, because it was removed from the map during the processing, or 2) the next() call will succeed but skip the prior batch of rows.
Attachments
Attachments
- HBASE-5974_0.94.patch
- 16 kB
- Anoop Sam John
- HBASE-5974_94-V2.patch
- 18 kB
- Anoop Sam John
- HBASE-5974_94-V3.patch
- 17 kB
- Anoop Sam John
- 5974_94-V4.patch
- 18 kB
- Ted Yu
- 5974_trunk.patch
- 30 kB
- Anoop Sam John
- 5974_trunk-V2.patch
- 30 kB
- Anoop Sam John
- 5974_trunk-V3.patch
- 32 kB
- Anoop Sam John
Activity
Hi Anoop. I was thinking something similar to that – but just using a long "cookie" instead of passing the actual keys around. So the conversation between client and server becomes:
C->S: openScanner(Scan)
S->C: return scanner ID, cookie = 0
C->S: next(10, cookie=0)
S->C: results: ... newCookie=1
C->S: next(10, cookie=1) [ but times out ]
S->C: results: ... newCookie = 2 [ but never gets received ]
C->S: next(10, cookie=1) (retry)
S->C: NACK: Bad cookie!
mm sounds a long cookie is enough.. There might not be a case of more number of request response, btw Client and one region (for one scan), than the long max value..
Good catch of a long living bug Todd..
Yea, actually an int would probably do - we shouldn't ever have 4B rows in a region. And even if it wrapped, the time it would take to do so would prevent any possible case where an old request got incorrectly allowed.
Anoop: want to work on this?
+1 on the cookie approach.
We've seen the LeaseExceptions. I didn't realize that in some cases the next next() would skip a batch of rows, that's quite bad.
We've seen the LeaseExceptions. I didn't realize that in some cases the next next() would skip a batch of rows, that's quite bad.
On a LeaseException I think we do the right thing - the client detects the error at the scanner level and can restart the scanner from the right spot. That's because the lease exception is a DoNotRetryIOException. But other exceptions like RPC timeouts or IO errors would get retried incorrectly with the bug described here.
In the scenario Todd described, can the server keep the most recent batch of results so that when server sees cookie from client carrying 1 less than expected value, the server can resend the stored batch ?
In the scenario Todd described, can the server keep the most recent batch of results so that when server sees cookie from client carrying 1 less than expected value, the server can resend the stored batch ?
I think it's too expensive in terms of memory consumption on the server. My assertion is that this issue is very rare (or else people probably would have noticed the missing rows a lot more often). So, keeping the last response batch cached server side seems a poor use of RAM.
My assertion is that this issue is very rare
We saw the LeaseExceptions frequently, so perhaps not. Critical priority is correct here I think.
LeaseExceptions don't cause this issue, though. They are DoNotRetryIOExceptions. It's only IOEs that cause the issue.
LeaseExceptions don't cause this issue, though.
Our environment is pretty good for causing IPC issues occasionally.
Unfortunately I can't comment on the frequency that IOEs caused missing rows, but I can testify to getting complaints about frequent LeaseExceptions for a time.
Suppose client gets 'Bad cookie' notification from server in the middle of a long scan.
If the client wants to continue where it left off in a new scanner (I assume this is the default / desriable behavior), the client needs to remember the key of the last KV it previously received and .
Using cookie saves bandwidth. But in faulty condition, I think the key of last KV would still be needed.
Suppose client gets 'Bad cookie' notification from server in the middle of a long scan.
If the client wants to continue where it left off in a new scanner (I assume this is the default / desriable behavior), the client needs to remember the key of the last KV it previously received and .
Yes. The client knows the last row and based on this it can create a new scanner so as to continue the scan from that point. This is already in place and this what will happen when client gets a NSRE in the middle of the scan from RS.
But in faulty condition, I think the key of last KV would still be needed.
Any specific reason in your mind Ted? As such I think a seq numbering is enough...
Trying to attach patch to JIRA but seems some issue with JIRA. I am getting some error dialog box instead of the attach dialog box !!
One point came up when I and Ram discussed
In ScannerCallable.call()
When a RemoteException comes, can we mark that as DoNotRetryIOException?
Any way the call reached the RS and it might have done atleast partial scan.[Moved the currentRow pointer]
Patch for 94. 1st version.. there are some points to discuss which will be done in following comments
+ if (-1 != cookie && cookie != rswc.cookie) { + this.scanners.remove(scannerName); + this.leases.cancelLease(scannerName); + throw new ScannerCookieOutOfOrderException("Cookie from client and cookie at server do not match"); + }
When the cookie is out of order, we remove the scanner and cancel Lease.[Not waiting for the lease timeout] This is in same line as what happens with NSRE. One question which arise in my mind is do we need to contact the CP hooks of pre and post scanner close here?
Now in normal close() of scanner from client side and lease timeout based close, we call CP hooks. But what abt the scanner getting removed from RS when the NSRE happens. [Same applicable to ScannerCookieOutOfOrderException case too]
When a RemoteException comes, can we mark that as DoNotRetryIOException?
I think it should be marked as DNRI.
Now in normal close() of scanner from client side and lease timeout based close, we call CP hooks. But what abt the scanner getting removed from RS when the NSRE happens.
So what's your suggestion, Anoop? call CP hooks in the finally section?
The patch looks nice. A minor comment:
RegionScanner scanner = scanners.get(scannerIdString).s;
Is there one chance of "scanners.get(scannerIdString)" is null? any way, the direct access to that private variable is not good.
@Todd/Stack
Can you review this patch? Seems to be pretty important fix.
- style: use 2-space indentation, not tabs
- why do we need the new RegionScannerWithCookie class? why not add the cookie to RegionScanner itself? I'd also move the cookie check into a function "scanner.checkAndIncrementCookie(cookie);"
- let's rename "cookie" to "callSequenceNumber" or something – since in this implementation it is a simple increasing sequence.
- this isn't currently compatible with 0.94, since a new client wouldn't be able to scan an old server. In order to make it compatible, we'd need to catch the MethodNotExists exception and set a flag in the client which disables cookie usage for that cluster.
- In the test, I think you should use HRegionInterface directly, so you don't have to actually generate an RPC timeout. As is, I think it's also not guaranteed to trigger the issue unless you set scanner caching to 1, right? I guess that's the default, but if we ever change the default, it would invalidate this test.
Thanks for the review Todd
why do we need the new RegionScannerWithCookie class? why not add the cookie to RegionScanner itself?
I was also thinking initially in this way. There are 2 reasons why I have avoided to do the seqNo work within the RegionScanner
1. In case of the caching>1 there will be more than one call to the RegionScanner.next(). U mean passing the client sent seqNo ( I am avoiding cookie as I agree with you to rename this ) to the RegionScanner which will change the interface. This is exposed
2. This is the main reason. With the CP usage we have exposed the RegionScanner and using the preScannerOpen() and postScannerOpen() impls user can now return his own RegionScanner impl. If we do this seqNo maintain and check logics in RegionScanner this will make the user to worry abt these? I feel this should be handled by HBase core code. What do u say?
this isn't currently compatible with 0.94, since a new client wouldn't be able to scan an old server.
Agree.. I can fix this
let's rename "cookie" to "callSequenceNumber"
Already agreed..
In the test, I think you should use HRegionInterface directly, so you don't have to actually generate an RPC timeout.
I thought of an E2E FT case.. Yes as u said the other one also I can write. So what is your recommendation? Should I change?
As is, I think it's also not guaranteed to trigger the issue unless you set scanner caching to 1, right?
May be in that case I can explicitly set the caching=1 for this test case. I can do that
Thanks for the review Jieshan
So what's your suggestion, Anoop? call CP hooks in the finally section?
I mean in whatever case when we close the scanner we need to call the CP hooks. Currently before this patch we were not doing this when getting a NSRE
catch (Throwable t) { if (t instanceof NotServingRegionException) { this.scanners.remove(scannerName); } throw convertThrowableToIOE(cleanup(t)); }
Here we can see it is not calling the CP hooks. As of now in case of the cookie out of order also I am not contacting the CP hooks.
RegionScanner scanner = scanners.get(scannerIdString).s;
Oh yes. Thanks for pointing it out. I will fix.. This was not in that direct next() call flow.. That is why I missed..
@Todd
this isn't currently compatible with 0.94, since a new client wouldn't be able to scan an old server.
Do we need to keep this way of compatibility also? I thought we need to have like old client and new server. Just wanted to confirm this.
@Todd
Regarding the solution for handling the compatibility with client,
When NoSuchMethodException happens at RS side, the IPC layer throws back the RemoteException in which the actual Exception is stringfied and set as the message. So if we need to handle, we need to do string matching on the Exception message at the client side. Pls give your opinion on this. Or can we mark this issue fix such that when the server side is upgraded the client side also need upgrade..
Other devs also pls give you suggestion
I have prepared a patch addressing the other comments and if we decide on the above I can submit the new patch
@Todd
Also regarding the comment
https://issues.apache.org/jira/browse/HBASE-5974?focusedCommentId=13275279&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13275279
What do you feel?
Addressed the comments
The comment regarding the compatibility at client side, I have fixed with the string check on the RemoteException stacktrace. I am not happy with this way but there is no other way as of now I can find. Pls give your comments
Should CallSequenceOutOfOrderException be extending DoNotRetryIOException ?
That way you don't need to create a new exception below:
+ } else if (ioe instanceof CallSequenceOutOfOrderException) { ... + throw new DoNotRetryIOException("Reset scanner", ioe);
I think users haven't experienced this bug.
In solving the bug, some kludge is introduced.
We should think twice before integration.
Last patch looks good to me. The string check on the RemoteException is not ideal, but I cannot think of anything better; somebody with more knowledge about our RPC should chime in.
Is the RegionScannerHolder needed? Why can't RegionScannerImpl not hold the sequence number and RegionScanner get a get/setSeq method?
Thanks for the review Lars
Is the RegionScannerHolder needed? Why can't RegionScannerImpl not hold the sequence number and RegionScanner get a get/setSeq method?
Why I was not doing that because through the CP we have exposed the RegionScanner interface. The pre and post scannerOpen methods in the CP can return an impl for the RegionScanner now. So in that case customer need to worry abt maintain this seqNo? I felt that wont be good. Am I clear to you Lars?
Yes the string based check, I am not getting any way. When the NoSuchMethodException happens at the RS side, we are creating RemoteException not wrapping this NoSuchMethodException but with a message=exception trace...
W.r.t. to RegionScannerHolder. I see that makes sense. As long as the sequence numbering also works when the CP intercepts a scanner and provides its own implementation. I don't think there'd be a case where a CP would want to interfere with the sequence numbers.
Re: the NoSuchMethodException check. I was wondering if there is anything with RPC version numbering to deal with this instead.
@Lars
Or is it this kind of a compatibility is not needed? Only upgradation at client side and not at server. I am not sure what kind of compatibility we need to maintain across new minor versions (from 94.0 to 94.1)
Also see the comments for the issue HBASE-6124. [Similar to our case]
https://issues.apache.org/jira/browse/HBASE-6124?focusedCommentId=13285008&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13285008
@Anoop. Compatibility is definitely needed across a minor version change.
If the solution is fine with every one I can make patch for other versions also.
@Ted
Regarding the new Exception extending DoNotRetryIOException, I was following NSRE. I can make this change. I think it should be ok.
HRegionInterface.java doesn't exist in trunk so patch v2 wouldn't apply to trunk.
I would suggest creating patch for trunk and run through hadoop QA.
+ LOG.info("Seq number based scan API not present at RS side! Trying with API: "
I think the above log should be at warn level.
+ } else if (ioe instanceof CallSequenceOutOfOrderException) { + // The callSeq from the client not matched with the one expected at the RS side + // This means the RS might have done extra scanning of data which is not received by the + // client.Throw a DNRE so that we close the current scanner and opens a new one with RS. + throw new DoNotRetryIOException("Reset scanner", ioe);
Should we disclose a little more detail in the message of DNRIOE ? The above is the same as response to NotServingRegionException and RegionServerStoppedException.
'not matched with' -> 'does not match'
'is not received' -> 'has not been received'
'opens a new' -> 'open a new'
+ // if callSeq do not match throw Exception straight away. This needs to be performed even
'do not match' -> 'does not match'
+public class TestClientScannerRPCTimesout {^M
Please add short javadoc for the test class. I think it should be called TestClientScannerRPCTimeout.
Please use utility such as dos2unix to remove the trailing ^M from the patch file.
+ public static class RegionServerWithScanTimesout extends MiniHBaseClusterRegionServer {^M
The above class can be made private. It should be named RegionServerWithScanTimeout.
+ * Thrown by a region server while scan related next() calls. Both client and server maintain a^M + * callSequence and if the both do not match, RS will throw this exception.^M + */^M +public class CallSequenceOutOfOrderException extends IOException {^M
CallSequenceOutOfOrderException should extend DoNotRetryIOException so that we don't need to create DoNotRetryIOException instance (shown above).
'while scan related next()' -> 'while doing scan related next()'
'the both do not' -> 'they do not'
It would be nice for Todd to take a look at the patch.
w.r.t. keeping RegionScannerHolder, I posted a poll to dev@hbase for use case of letting [pre,post]ScannerOpen() return a custom RegionScanner implementation.
@Ted
Will look into your comments and need a rebase for 94 patch too..
I will make seperate patch fro trunk also.
wrt pre and post CP hooks, it is not only creating a new custom RegionScanner, but may be creating a wrapper for the actual RegionScanner. In one of our impl, we use this approach. [Just a wrapper which delegates the calls with some extra steps for next() calls. ] Here also if we add new methods to RegionScanner interface which deals with the check and incerement for this seqNo, will get exposed to user. I felt this might look odd for them.
Patch addressing Ted's comments
The above class can be made private. It should be named RegionServerWithScanTimeout.
The class name is changed. But we can not make this private. If so RS impl class can not get instantiated.
Patch v4 makes a small change to JVMClusterUtil.java so that RegionServerWithScanTimeout can be made private.
TestClientScannerRPCTimeout passes.
w.r.t. potential change to RegionScanner, if users create wrapper(s), the maintenance of seqNo would still be completed by core implementation.
See the following in patch:
+ public Result[] next(final long scannerId, int nbRows) throws IOException { + return next(scannerId, nbRows, -1); + }
and the following in current code base:
@InterfaceAudience.Private public interface RegionScanner extends InternalScanner {
@Ted
Pls see the below comment from Andrew Purtell
https://issues.apache.org/jira/browse/HBASE-5517?focusedCommentId=13225624&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13225624
I think we need to change the RegionScanner from private to public then.
In Trunk InterfaceAudience for RegionScanner to be public I feel ( as per the use case that we are having. We wants to impl this interface wrapping the actual RegionScanner object created by the Region ).. Not made this change in the attached patch. Pls provide suggestions
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12530781/5974_trunk.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.regionserver.TestAtomicOperation
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2097//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2097//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2097//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2097//console
This message is automatically generated.
@Andrew and @Todd
Can you have a look at this pls? Also regarding the point of RegionScanner exposing through CP and its use cases, can you give your opinion?
I don't think we should be trying to lock down internal APIs like RegionScanner. Coprocessors are an advanced interface and I think we should not endeavour to provide full compatibility between versions for them. We'll end up getting bogged down way too fast trying to keep this compatibility everywhere.
So, I'm still in favor of just adding a new field/method to the existing RegionScanner implementation.
As for the compatibility code: I think the useCallSeq flag has to be at a wider scope than this. Having it in ScannerCallable means that each individual scan will print a warning (and have to do an extra round trip), which is too much.
As for the compatibility code: I think the useCallSeq flag has to be at a wider scope than this. Having it in ScannerCallable means that each individual scan will print a warning (and have to do an extra round trip), which is too much.
I totally agree with you Todd. I thought about this while making the patch. I couldn't find an apt place for saving this info and making use. This is scan specific so should be some where in the scan code area I felt. Also my worry was can it be like in one cluster itself one RS is with new code and another with old?
Any Suggestions Todd.. As you said, extra round trip is a worry..
I don't think we should be trying to lock down internal APIs like RegionScanner. Coprocessors are an advanced interface and I think we should not endeavour to provide full compatibility between versions for them.
Rather than the compatability at interface level my worry is RegionScanner handling this seq maintain and checking job.
In our scenario we wanted custom behaviour in RegionScanner. So what we have done is created a new RegionScanner impl wrapping the actual RegionScanner object created by the HRegion. Our custom impl class implements all the methods and for the actual work we delegate the call. The extra steps we wanted to do, we are doing before this delegation.
Now we need to add a new method in the RegionScanner [like checkAndIncrement(seqNo)] which should be called by HRS.. Now in our usecase, we need to implement this and may be delegate the call. Making this very much internal item exposed to user was my worry. Hope I make it clear. We dont have any usecase like making full custom RegionScanner impl (without even the wrapped object) now. I dont know someone might really need some thing like that.[But at least the CP allows to do so now] In such a thing this impl of seqNo check need to be handled by that code itself.
In Trunk patch
HRegionServer
+ // if callSeq does not match throw Exception straight away. This needs to be performed + // even before checking of Lease. + if (rsh == null) { + rsh = scanners.get(scannerName); + } + if (rsh != null) { + if (request.getCallSeq() != rsh.callSeq) { + throw new CallSequenceOutOfOrderException("Expected seq: " + rsh.callSeq + + " But the seq got from client: " + request.getCallSeq()); + } + // Increment the callSeq value which is the next expected from client. + rsh.callSeq++; + }
Here we need to check whether the callSeq is present in the ScanRequest or not. If it is present then only we can do the comparison. I have noticed this and will a patch V2 on Trunk with this change soon
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12531221/5974_trunk-V2.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.regionserver.TestAtomicOperation
org.apache.hadoop.hbase.master.TestMasterNoCluster
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2120//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2120//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2120//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2120//console
This message is automatically generated.
2 test classes need change
TestAssignmentManager#processServerShutdownHandler()
TestMetaReaderEditorNoCluster#testRideOverServerNotRunning ()
I will give a new patch with these test case changes needed as per the patch soon.
There were some concerns over where to handle the seq no. [at HRS or at RegionScannerImpl]
I had added my reasoning ... Any thoughts?
Request opinion from Andrew also.
Let's move this to 0.94.2. We had this behaviour since the beginning.
Let's do this correctly in 0.96 (where it is OK to break wire compatibility).
Removing this from 0.94.
This means we can pass the seqno as a proper field in the Scan object.
Thanks for reminding me about this important issue Anoop. Lets get it into trunk at least (Should be a blocker on 0.96).
Minor: Do you foresee the sequencing being used other than in scanners (Its implemented in ScannerCallable only at moment)? If NOT, would OutOfOrderNextException or OutOfOrderScannerNextException be a better name than CallSequenceOutOfOrderException? If you do this, would suggest changing param names from callSeq to nextSeq? The former is generic.
Is CallSequenceOutOfOrderException an exception that will bubble up into the application or is the client its intended audience? If so, should we annotate it @InterfaceAudience.Private as Abortable is for instance?
+ if ((cause == null || (!(cause instanceof NotServingRegionException) + && !(cause instanceof RegionServerStoppedException))) + && !(e instanceof CallSequenceOutOfOrderException)) {
Would it make sense testing ! DoNotRetryIOException rather than calling out the two exceptions above? Would that be too broad? Otherwise, wondering if these two exceptions should inherit from a common base class given they are getting this special treatment. Not important. Just a thought.
Check your comments. You seem to be saying 'scan' when you mean 'next'.
For example, this comment:
+ // increment the callSeq which will be getting used for the next time scan() call to + // the RS.In case of a timeout this increment should not happen so that the next + // trial also will be done with the same callSeq.
Regards the above comment, is there a sentence missing on what happens if we go back w/ the same callseq? Maybe have another go at this whole comment... it could be a bit clearer. Thanks.
Committing this set of comments.... more to come.
Like Lars and Todd above I almost said why RegionScannerHolder? Is it because if you add this class, you can do the seqid management in the regionserver and CPs do not have to be concerned with its management? That is fair enough and this is pretty elegant soln to the issue.
Your worry is that a CP might return its own RegionScanner implementation. Can you not add to RegionScanner to methods, getNextSeq, and incrementNextSeq? Then the regionserver would have the means for keeping up the seqid in whatever the implementation rather than introduce this wrapper class? RegionScanner is marked as private. For the trunk patch, we are requiring a restart and allowing that CPs may need modification to make the leap from 0.94 to 0.96. Does that make it easier on you modifying RegionScanner to support nextSeq?
Yeah, I think that rather than:
+ optional uint64 callSeq = 6;
it should be called nextSeq or nextCallSeq (the latter might be better because nextSeq might make you think the next sequence id rather than the sequence id for the 'next' call). All references should be changed if you change this I"d say Anoop.
This test, TestClientScannerRPCTimeout, looks good. Does it have to be in a separate class? Do we not have a scanner test already that has a running minicluster? I can understand though if you want to keep this separate because its hard to integrate into a current test suite. If you are going to have a class for this single test and are going to go to the bother of spinning up a whole minicluster, I'd say do a bit more testing. Are there other scenarios you can manufacture? Can you add assert that you actually slept?
Good stuff Anoop. Lets get this important patch in.
Stack
Thanks for your review.. I will work on this today
It is been some time that I have not seen this patch Let me get a look into this
Do you foresee the sequencing being used other than in scanners (Its implemented in ScannerCallable only at moment)? If NOT, would OutOfOrderNextException or OutOfOrderScannerNextException be a better name than CallSequenceOutOfOrderException? If you do this, would suggest changing param names from callSeq to nextSeq? The former is generic.
Issues in which a retry from client as a result of timeout can come in other client API like increment() also I guess. It can increment 2 times or may be more.. Here in this particular case the names you suggested better than mine.. Sorry I am not good at naming
Is CallSequenceOutOfOrderException an exception that will bubble up into the application or is the client its intended audience? If so, should we annotate it @InterfaceAudience.Private as Abortable is for instance?
Yes you are correct. I will change
Would it make sense testing ! DoNotRetryIOException rather than calling out the two exceptions above? Would that be too broad? Otherwise, wondering if these two exceptions should inherit from a common base class given they are getting this special treatment. Not important. Just a thought.
Here 3 exceptions getting the special treatment. !DoNotRetryIOException I can try.. Didnt want to change this key area..But at the time of this patch preperation I felt like why this is been written this way. I had to add my new Exception out here to make thing work.. May be will keep the code the current way as in the patch. Will open up a new jira where we can discuss about making a common base class or other changes. I think your suggestion makes sense for maintainability
Check your comments. You seem to be saying 'scan' when you mean 'next'.
Yes I need do changes here. Will do
Will continue in next comment....
Regarding RegionScannerHolder
Yes Stack my worry was wrt CP.. In our usage also we are returning a RegionScanner from the CP. This is a wrapper over the actual RegionScannerImpl. Yes by adding the methods we can make sure that the logic is being executed by the HRS but the seqno# maintain should be within the RegionScanner. In our case we can just delegate these 2 calls to the RegionScannerImpl created by kernel. My worry was this exposes this seq stuff to the user. Ideally the seq maintain etc is our problem and to solve we introduce this stuff.. That is why I was/am in 2 minds.. If you all say that is fine, I can change.. We need to clearly document these 2 methods with its importance and what to do wrt this etc.. Considering all these my mind is more hanging towards the current way..
optional uint64 callSeq = 6;
it should be called nextSeq or nextCallSeq.... All references should be changed if you change this I"d say Anoop
Sure I will change.
This test, TestClientScannerRPCTimeout, looks good. Does it have to be in a separate class?
Let me see where I can fit this test case. Yes agree with you regarding the overhead of minicluster startup.
Also this fix is already running in our cluster.
If you all say that is fine, I can change.. We need to clearly document these 2 methods with its importance and what to do wrt this etc.
The class is marked private and used serverside only so its ok adding methods to the Interface?
But if you feel strongly about it, keep it as is. it looks a bit ugly but at least the uglyness is contained inside HRegionServer.
Make a new patch and lets get it committed. Thanks Anoop.
This test, TestClientScannerRPCTimeout, looks good. Does it have to be in a separate class?
Stack the reason why I had done it is we need to have an impl of the HRS where I can introduce some sleep so as to make the socket timeouts at the client side.
conf.setStrings(HConstants.REGION_SERVER_IMPL, RegionServerWithScanTimeout.class.getName());
Didn't want to make this change in any of the existing test cases. So is it reasonable to be have the new test class?
anoopsamjohn You going to make a final patch Anoop? Lets get this in (Thanks).
Yes Stack... I am working on this now.. Will give it today..
Sorry for the delay..
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12550776/5974_trunk-V3.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
-1 javadoc. The javadoc tool appears to have generated 82 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3146//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3146//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3146//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3146//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3146//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3146//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3146//console
This message is automatically generated.
This is not wire compatible, right? (if it would be I'd pull it back into 0.94)
lhofhansl Correct. Adds a sequenceid on each next invocation via pb.
Integrated in HBase-TRUNK #3486 (See https://builds.apache.org/job/HBase-TRUNK/3486/)
HBASE-5974 Scanner retry behavior with RPC timeout on next() seems incorrect (Revision 1402214)
Result = FAILURE
stack :
Files :
- /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/OutOfOrderScannerNextException.java
- /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
- /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java
- /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java
- /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java
- /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
- /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/JVMClusterUtil.java
- /hbase/trunk/hbase-server/src/main/protobuf/Client.proto
- /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestClientScannerRPCTimeout.java
Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #236 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/236/)
HBASE-5974 Scanner retry behavior with RPC timeout on next() seems incorrect (Revision 1402214)
Result = FAILURE
stack :
Files :
- /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/OutOfOrderScannerNextException.java
- /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
- /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java
- /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java
- /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java
- /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
- /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/JVMClusterUtil.java
- /hbase/trunk/hbase-server/src/main/protobuf/Client.proto
- /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestClientScannerRPCTimeout.java
After the patch, if a next request is always timeout because too large, we will retry forever now, is it so?
Without the patch, we do the retry in ServerCallable#ServerCallable, but now we do the retry in ClientScanner#next which is no retry number limit
Chunhui
Not really.. I tested once more with change in test code so as to sleep all the times.
Got the exception at client side
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
I tested once more with change in test code so as to sleep all the times.
Anoop, if change as so, you can't produce my mentioned problem.
Do the following change in the test case TestClientScannerRPCTimeout, you could find we will retry forever
@Override public Result[] next(long scannerId, int nbRows, long callSeq) throws IOException { if (!slept && this.tableScannerId == scannerId && seqNoToSleepOn == callSeq) { throw new CallSequenceOutOfOrderException("Test"); // try { // Thread.sleep(rpcTimeout + 500); // } catch (InterruptedException e) { // } // slept = true; } return super.next(scannerId, nbRows, callSeq); }
Wait for your answer, thanks
The process is as the following:
1.A next request is very large, so first time it is failed because of timeout
2.The second time it is failed again because of CallSequenceOutOfOrderException
3.We will retry this next request again from the init startrow, so it is also very large and failed again
4.Repeated the above forever...
Chunhui
Your above mentioned steps for that the above test case change is correct?
I think for that teh change to be
public Result[] next(long scannerId, int nbRows, long callSeq) throws IOException { if ( this.tableScannerId == scannerId ) { try { Thread.sleep(rpcTimeout + 500); } catch (InterruptedException e) { } } return super.next(scannerId, nbRows, callSeq); }
Your change was always throwing CallSequenceOutOfOrderException so that at client never the timeout happens. This will happen ideally??
As in your steps, #1 and #2 is correct. Then again the new call on the new Scanner it wont throw CallSequenceOutOfOrderException any way. But it can either be successfull or timed out at server. Why my change is doing is make it time out at client side always...
Then again the new call on the new Scanner it wont throw CallSequenceOutOfOrderException any way, But it can either be successfull or timed out at server
Yes,if it timed out again because this request is too large, in the next retry we will throw CallSequenceOutOfOrderException , is it so?
I think I have done wrong in the test code change.. Yes your point is valid.. (checked with functional way ) It behaves like NSRE which is also like infinite retry...
Will open up a new JIRA and will solve the issue?
@Annop
I open a new issue HBASE-7070 and upload a simple patch, could you tak a view?
On 01/Jul/12 08:25, Anoop said:
I had added my reasoning ... Any thoughts?
There was a long period of inactivity till 18/Oct/12 when Stack picked this up.
I don't see a second +1 since 18/Oct/12.
Maybe I missed something.
ted_yu The above makes no sense. I think you are trying to find a violation of commit protocol but there is none. Go reread what we all agreed.
Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #248 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/248/)
HBASE-7070 Scanner may retry forever after HBASE-5974 (Revision 1405107)
Result = FAILURE
stack :
Files :
- /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
- /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestClientScannerRPCTimeout.java
lhofhansl
This is a critical bug in scan in 0.94. We should fix it in 0.94.
What's your suggestion?
Thanks
FAILURE: Integrated in HBase-0.94-JDK7 #183 (See https://builds.apache.org/job/HBase-0.94-JDK7/183/)
HBASE-11957 Backport to 0.94 HBASE-5974 Scanner retry behavior with RPC timeout on next() seems incorrect. (Liu Shaohui original patch by Anoop Sam John) (larsh: rev 8f9faabf579c02476acb791c145f34baf49ac8f5)
- src/main/java/org/apache/hadoop/hbase/regionserver/RegionScannerHolder.java
- src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
- src/main/java/org/apache/hadoop/hbase/util/JVMClusterUtil.java
- src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
- src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
- src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java
- src/main/java/org/apache/hadoop/hbase/CallSequenceOutOfOrderException.java
- src/test/java/org/apache/hadoop/hbase/client/TestClientScannerRPCTimeout.java
FAILURE: Integrated in HBase-0.94 #1413 (See https://builds.apache.org/job/HBase-0.94/1413/)
HBASE-11957 Backport to 0.94 HBASE-5974 Scanner retry behavior with RPC timeout on next() seems incorrect. (Liu Shaohui original patch by Anoop Sam John) (larsh: rev 8f9faabf579c02476acb791c145f34baf49ac8f5)
- src/main/java/org/apache/hadoop/hbase/regionserver/RegionScannerHolder.java
- src/test/java/org/apache/hadoop/hbase/client/TestClientScannerRPCTimeout.java
- src/main/java/org/apache/hadoop/hbase/util/JVMClusterUtil.java
- src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
- src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java
- src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
- src/main/java/org/apache/hadoop/hbase/CallSequenceOutOfOrderException.java
- src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
FAILURE: Integrated in HBase-0.94-security #524 (See https://builds.apache.org/job/HBase-0.94-security/524/)
HBASE-11957 Backport to 0.94 HBASE-5974 Scanner retry behavior with RPC timeout on next() seems incorrect. (Liu Shaohui original patch by Anoop Sam John) (larsh: rev 8f9faabf579c02476acb791c145f34baf49ac8f5)
- src/test/java/org/apache/hadoop/hbase/client/TestClientScannerRPCTimeout.java
- src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
- src/main/java/org/apache/hadoop/hbase/CallSequenceOutOfOrderException.java
- src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
- src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java
- src/main/java/org/apache/hadoop/hbase/regionserver/RegionScannerHolder.java
- src/main/java/org/apache/hadoop/hbase/util/JVMClusterUtil.java
- src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
So we might need a proper sequencing of the next() calls from the client to RS... Might be like the RegionScanner maintain the key of the last KV it passed to client and client in the next next() call pass back this key which is the last KV's key it received?
In the scenario which is mentioned above which might cause KVs to be skipped, these 2 keys won't match and we can close the scan at RS side and give exception which in turn can cause a new scanner creation from that point of last KV from the client side... [Some thing like how currently ClientScanner react to NSRE etc]
When the client is not passing any last key(null), there wont be any check in the RegionScanner level and scan will continue as normal like the current way. One down side is that with every scan() call to the RS, need to pass some extra data.. May be if we can distinguish the 1st attempt of call() in the ScannerCallable and the later attempts, we can avoid passing the key for the 1st RS call. Only for the later trials we pass the key...
These are some thoughts came to me after going through the code.. Your thoughts...