Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Reviewed
Description
We spent a bit of time making it so exceptions included the remote host name. It looks like we can add the target Region name too with a bit of manipulation; this will help with figuring out hot-spotting or a problem Region on the server side. For example, here is what I was seeing recently on the client side when a RS was timing out requests:
2021-04-06T02:18:23.533Z, RpcRetryingCaller{globalStartTime=1617675482894, pause=100, maxAttempts=4}, org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to ps0989.example.org/1.1.1.1:16020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call[id=88369369,methodName=Get], waitTime=5006, rpcTimeout=5000
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:145)
    at org.apache.hadoop.hbase.client.HTable.get(HTable.java:383)
    at org.apache.hadoop.hbase.client.HTable.get(HTable.java:357)
    ...
Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to ps0989.bot.parsec.apple.com/17.58.114.206:16020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call[id=88369369,methodName=Get], waitTime=5006, rpcTimeout=5000
    at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:209)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:378)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:89)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:409)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:405)
    at org.apache.hadoop.hbase.ipc.Call.setTimeout(Call.java:110)
    at org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:136)
    at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:672)
    at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:747)
    at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:472)
    ... 1 more
Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call[id=88369369,methodName=Get], waitTime=5006, rpcTimeout=5000
    at org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:137)
    ... 4 more
I wanted the Region it was hitting. I wanted to know if it was a server problem or a Region issue. If clients are only having issues with one Region, then I could focus on it.
After the PR the exception (from another context) looks like this:
org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to address=127.0.0.1:12345, regionInfo=hbase:meta,,1.1588230740 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: error ....
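To illustrate the message shape above, here is a minimal sketch of how a wrapper could prepend the target address and Region name to an exception. It is not the code from the PR: the class `WrapExceptionSketch` and the method `wrapWithContext` are hypothetical names, the Region is passed as a plain string rather than an HBase type, and a plain `IOException` stands in for `CallTimeoutException`.

```java
import java.io.IOException;
import java.net.InetSocketAddress;

public class WrapExceptionSketch {

  /**
   * Hypothetical helper (not an HBase API): wraps an error so that its message
   * names the remote address and, when known, the target Region.
   */
  static IOException wrapWithContext(InetSocketAddress address, String regionName,
      Exception error) {
    StringBuilder target = new StringBuilder("address=")
        .append(address.getHostString())
        .append(':')
        .append(address.getPort());
    // Assumption: the Region may be unknown for some calls, so it is omitted when null.
    if (regionName != null) {
      target.append(", regionInfo=").append(regionName);
    }
    // Keep the original exception as the cause so the full stack trace is preserved.
    return new IOException(
        "Call to " + target + " failed on local exception: " + error, error);
  }

  public static void main(String[] args) {
    InetSocketAddress addr = InetSocketAddress.createUnresolved("127.0.0.1", 12345);
    IOException wrapped = wrapWithContext(
        addr, "hbase:meta,,1.1588230740", new IOException("call timed out"));
    System.out.println(wrapped.getMessage());
  }
}
```

With the Region name in the message, a client-side log search can group timeouts by Region and show whether failures cluster on one Region or spread across a whole server.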
Issue Links
- breaks: PHOENIX-6438 Compilation failures after changes to Private annotated HBase class RpcControllerFactory (Resolved)
- causes: HBASE-25942 Get rid of null regioninfo in wrapped connection exceptions (Resolved)
- fixes: PHOENIX-6438 Compilation failures after changes to Private annotated HBase class RpcControllerFactory (Resolved)