Details
- Bug
- Status: Resolved
- Critical
- Resolution: Fixed
- 3.0.0-alpha-1, 2.0.0
- None
Description
Before HBASE-25187, we saw RegionServer JVM crashes on our production clusters. The core dump information is as follows:
Stack: [0x00007f621ba8d000,0x00007f621bb8e000], sp=0x00007f621bb8c0e0, free space=1020k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
J 10829 C2 org.apache.hadoop.hbase.ByteBufferKeyValue.getTimestamp()J (9 bytes) @ 0x00007f6a5ee11b2d [0x00007f6a5ee11ae0+0x4d]
J 22844 C2 org.apache.hadoop.hbase.regionserver.HRegion.doCheckAndRowMutate([B[B[BLorg/apache/hadoop/hbase/filter/CompareFilter$CompareOp;Lorg/apache/hadoop/hbase/filter/ByteArrayComparable;Lorg/apache/hadoop/hbase/client/RowMutations;Lorg/apache/hadoop/hbase/client/Mutation;Z)Z (540 bytes) @ 0x00007f6a60bed144 [0x00007f6a60beb320+0x1e24]
J 17972 C2 org.apache.hadoop.hbase.regionserver.RSRpcServices.checkAndRowMutate(Lorg/apache/hadoop/hbase/regionserver/Region;Ljava/util/List;Lorg/apache/hadoop/hbase/CellScanner;[B[B[BLorg/apache/hadoop/hbase/filter/CompareFilter$CompareOp;Lorg/apache/hadoop/hbase/filter/ByteArrayComparable;Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$RegionActionResult$Builder;)Z (312 bytes) @ 0x00007f6a5f4a7ed0 [0x00007f6a5f4a6f40+0xf90]
J 26197 C2 org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(Lorg/apache/hbase/thirdparty/com/google/protobuf/RpcController;Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$MultiRequest;)Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$MultiResponse; (644 bytes) @ 0x00007f6a61538b0c [0x00007f6a61537940+0x11cc]
J 26332 C2 org.apache.hadoop.hbase.ipc.RpcServer.call(Lorg/apache/hadoop/hbase/ipc/RpcCall;Lorg/apache/hadoop/hbase/monitoring/MonitoredRPCHandler;)Lorg/apache/hadoop/hbase/util/Pair; (566 bytes) @ 0x00007f6a615e8228 [0x00007f6a615e79c0+0x868]
J 20563 C2 org.apache.hadoop.hbase.ipc.CallRunner.run()V (1196 bytes) @ 0x00007f6a60711a4c [0x00007f6a60711000+0xa4c]
J 19656% C2 org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(Ljava/util/concurrent/BlockingQueue;Ljava/util/concurrent/atomic/AtomicInteger;)V (338 bytes) @ 0x00007f6a6039a414 [0x00007f6a6039a320+0xf4]
j org.apache.hadoop.hbase.ipc.RpcExecutor$1.run()V+24
j java.lang.Thread.run()V+11
v ~StubRoutines::call_stub
I have written a unit test that reproduces this error 100% of the time.
After HBASE-25187, the JVM no longer crashes, but the check result of checkAndMutate is false, because it reads wrong/dirty data from the released ByteBuff.
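To illustrate the failure mode described above, here is a minimal sketch in plain Java. It is not HBase code and does not reproduce the actual unit test or the fix. The class `ReleasedBufferSketch`, the nested `BufferBackedCell`, and its `deepClone` method are hypothetical stand-ins, and "release" is simulated by overwriting a direct `ByteBuffer` as if the pool had handed it to another request. The sketch only shows why a cell that is a view over a released buffer can return a wrong timestamp, while a copy taken before the release does not.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, not HBase code: simulates reading a cell after its
// backing buffer has been released and reused.
final class ReleasedBufferSketch {

    // Stand-in for a cell that only holds a view of a pooled buffer.
    static final class BufferBackedCell {
        private final ByteBuffer buf;
        private final int tsOffset;

        BufferBackedCell(ByteBuffer buf, int tsOffset) {
            this.buf = buf;
            this.tsOffset = tsOffset;
        }

        // Reads the timestamp directly from the backing buffer.
        long getTimestamp() {
            return buf.getLong(tsOffset);
        }

        // Copies the timestamp into a private on-heap buffer so the result
        // no longer depends on the pooled buffer's lifetime.
        BufferBackedCell deepClone() {
            ByteBuffer copy = ByteBuffer.allocate(Long.BYTES);
            copy.putLong(0, buf.getLong(tsOffset));
            return new BufferBackedCell(copy, 0);
        }
    }

    // Returns {timestamp seen by the view, timestamp seen by the copy}
    // after the pooled buffer has been "released" and reused.
    static long[] demo() {
        // Assumption: a direct buffer stands in for a pooled off-heap block.
        ByteBuffer pooled = ByteBuffer.allocateDirect(16);
        pooled.putLong(0, 1000L);

        BufferBackedCell view = new BufferBackedCell(pooled, 0);
        BufferBackedCell copy = view.deepClone();

        // Simulated release and reuse: another request overwrites the block.
        pooled.putLong(0, -1L);

        return new long[] { view.getTimestamp(), copy.getTimestamp() };
    }

    public static void main(String[] args) {
        long[] r = demo();
        System.out.println("view after reuse: " + r[0]); // dirty value
        System.out.println("copy after reuse: " + r[1]); // original value
    }
}
```

In this simulation the view only returns dirty data. In a real JVM, reading memory that has actually been freed can instead crash the process, as in the stack trace above.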
Issue Links
- causes: HBASE-26777 BufferedDataBlockEncoder$OffheapDecodedExtendedCell.deepClone throws UnsupportedOperationException (Resolved)
- fixes: HBASE-27267 Delete causes timestamp to be negative (Resolved)
- relates to: PHOENIX-6658 Replace HRegion.get() calls (Resolved)