Details
-
Sub-task
-
Status: Patch Available
-
Major
-
Resolution: Unresolved
-
None
-
None
Description
We use mysql innodb as the state store engine,in router log we found dead lock error like below:
[2018-05-21T15:41:40.383+08:00] [ERROR] [IPC Server handler 25 on 8050] : Unable to insert the newly generated application application_1526295230627_127402
com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
at sun.reflect.GeneratedConstructorAccessor107.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
at com.mysql.jdbc.Util.getInstance(Util.java:408)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:952)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3973)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3909)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2527)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2680)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2484)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1858)
at com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2079)
at com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2013)
at com.mysql.jdbc.PreparedStatement.executeLargeUpdate(PreparedStatement.java:5104)
at com.mysql.jdbc.CallableStatement.executeLargeUpdate(CallableStatement.java:2418)
at com.mysql.jdbc.CallableStatement.executeUpdate(CallableStatement.java:887)
at com.zaxxer.hikari.pool.ProxyPreparedStatement.executeUpdate(ProxyPreparedStatement.java:61)
at com.zaxxer.hikari.pool.HikariProxyCallableStatement.executeUpdate(HikariProxyCallableStatement.java)
at org.apache.hadoop.yarn.server.federation.store.impl.SQLFederationStateStore.addApplicationHomeSubCluster(SQLFederationStateStore.java:547)
Use "show engine innodb status;" command to find what happens
2018-05-21 15:41:40 7f4685870700 *** (1) TRANSACTION: TRANSACTION 241131538, ACTIVE 0 sec inserting, thread declared inside InnoDB 4999 mysql tables in use 2, locked 2 LOCK WAIT 4 lock struct(s), heap size 1184, 2 row lock(s) MySQL thread id 7602335, OS thread handle 0x7f46858f2700, query id 2919792534 192.168.1.138 federation executing INSERT INTO applicationsHomeSubCluster (applicationId,homeSubCluster) (SELECT applicationId_IN, homeSubCluster_IN FROM applicationsHomeSubCluster WHERE applicationId = applicationId_IN HAVING COUNT(*) = 0 ) *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 113 page no 21208 n bits 296 index `PRIMARY` of table `guldan_federationstatestore`.`applicationshomesubcluster` trx id 241131538 lock_mode X locks gap before rec insert intention waiting Record lock, heap no 23 PHYSICAL RECORD: n_fields 4; compact format; info bits 0 0: len 30; hex 6170706c69636174696f6e5f313532363239353233303632375f31323734; asc application_1526295230627_1274; (total 31 bytes); 1: len 6; hex 00000ba5f32d; asc -;; 2: len 7; hex dd000000280110; asc ( ;; 3: len 13; hex 686f70655f636c757374657231; asc hope_cluster1;; *** (2) TRANSACTION: TRANSACTION 241131539, ACTIVE 0 sec inserting, thread declared inside InnoDB 4999 mysql tables in use 2, locked 2 4 lock struct(s), heap size 1184, 2 row lock(s) MySQL thread id 7600638, OS thread handle 0x7f4685870700, query id 2919792535 192.168.1.138 federation executing INSERT INTO applicationsHomeSubCluster (applicationId,homeSubCluster) (SELECT applicationId_IN, homeSubCluster_IN FROM applicationsHomeSubCluster WHERE applicationId = applicationId_IN HAVING COUNT(*) = 0 ) *** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 113 page no 21208 n bits 296 index `PRIMARY` of table `guldan_federationstatestore`.`applicationshomesubcluster` trx id 241131539 lock mode S locks gap before rec Record lock, heap no 23 PHYSICAL RECORD: n_fields 4; compact format; info bits 0 0: len 30; hex 6170706c69636174696f6e5f313532363239353233303632375f31323734; asc application_1526295230627_1274; (total 31 bytes); 1: len 6; hex 00000ba5f32d; asc -;; 2: len 7; hex dd000000280110; asc ( ;; 3: len 13; hex 686f70655f636c757374657231; asc hope_cluster1;; *** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 113 page no 21208 n bits 296 index `PRIMARY` of table `guldan_federationstatestore`.`applicationshomesubcluster` trx id 241131539 lock_mode X locks gap before rec insert intention waiting Record lock, heap no 23 PHYSICAL RECORD: n_fields 4; compact format; info bits 0 0: len 30; hex 6170706c69636174696f6e5f313532363239353233303632375f31323734; asc application_1526295230627_1274; (total 31 bytes); 1: len 6; hex 00000ba5f32d; asc -;; 2: len 7; hex dd000000280110; asc ( ;; 3: len 13; hex 686f70655f636c757374657231; asc hope_cluster1;; *** WE ROLL BACK TRANSACTION (2)
PROCEDURE sp_addApplicationHomeSubCluster will create gap lock .
insert into select where clause,if the applicationId does not exists in table applicationsHomeSubCluster ,it will create a gap lock。
At this moment ,if other threads want to insert new records of the gap,deaklock may happen.
To reproduce the situation , we use 3 sessions to insert different applicationId :application_1526528662010_001201,application_1526528662010_001202,application_1526528662010_001203
To fix this issue,we should use insert ignore into instead of insert into select
Attachments
Attachments
Issue Links
- links to