Description
In sentry unit tests, we don't create schema before running a test. Instead, we use dataNucleus to create sentry tables when they are accessed. This creates potential deadlock when running test for Sentry HA setup.
For example, the following shows the event sequence that causes deadlock
1) thread_1 gets shared lock of SYSTABLES in order to read table SENTRY_HMS_NOTIFICATION_ID
2) thread_2 gets shared lock of SYSTABLES in order to read table SENTRY_HMS_NOTIFICATION_ID
3) thread_1 tries to get execution lock to create table SENTRY_HMS_NOTIFICATION_ID,
and wait for execution lock because thread_2 got shared lock already.
4) thread_2 tries to get execution lock to create table SENTRY_HMS_NOTIFICATION_ID,
and wait for execution lock because thread_1 got shared lock already.
The solution is to let the instances of sentry service start with delay. Specifically,
let HMS follower threads separate as far as possible, i.e., half of the interval.
This deadlock only exists in unit tests, and does not exist in production because schema is created before starting Sentry services. Therefore, there is no table creation after service starts.
Example of the log message when such deadlock happens
2019-01-04 18:32:46,332 (pool-13-thread-1) [INFO - org.apache.sentry.hdfs.DBUpdateForwarder.getAllUpdatesFrom(DBUpdateForwarder.java:140)] (org.apache.sentry.hdfs.PermImageRetriever) Requested sequence number 0 is less than 0 or requested deltas for that sequence number are not available. Fetch a full update
2019-01-04 18:32:49,346 (hms-follower) [DEBUG - org.apache.sentry.service.thrift.SentryStateBank.enableState(SentryStateBank.java:102)] HMSFollower entered state STARTED
2019-01-04 18:32:50,091 (hms-follower) [DEBUG - org.apache.sentry.service.thrift.SentryStateBank.enableState(SentryStateBank.java:102)] HMSFollower entered state STARTED
2019-01-04 18:33:00,286 (store-cleaner) [ERROR - org.datanucleus.util.Log4JLogger.error(Log4JLogger.java:115)] Error thrown executing CREATE TABLE SENTRY_HMS_NOTIFICATION_ID
(
NOTIFICATION_ID BIGINT NOT NULL
) : A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
Lock : ROW, SYSTABLES, (1,3)
Waiting XID : {436, X} , SENTRY, CREATE TABLE SENTRY_HMS_NOTIFICATION_ID
(
NOTIFICATION_ID BIGINT NOT NULL
)
Granted XID : {419, S}
Lock : ROW, SYSTABLES, (1,3)
Waiting XID : {419, X} , SENTRY, CREATE TABLE SENTRY_HMS_NOTIFICATION_ID
(
NOTIFICATION_ID BIGINT NOT NULL
)
Granted XID : {419, S} , {436, S}
. The selected victim is XID : 436.
java.sql.SQLTransactionRollbackException: A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
Lock : ROW, SYSTABLES, (1,3)
Waiting XID : {436, X} , SENTRY, CREATE TABLE SENTRY_HMS_NOTIFICATION_ID
(
NOTIFICATION_ID BIGINT NOT NULL
)
Granted XID : {419, S}
Lock : ROW, SYSTABLES, (1,3)
Waiting XID : {419, X} , SENTRY, CREATE TABLE SENTRY_HMS_NOTIFICATION_ID
(
NOTIFICATION_ID BIGINT NOT NULL
)
Granted XID : {419, S} , {436, S}
. The selected victim is XID : 436.
at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:254)
at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:879)
at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatementList(AbstractTable.java:830)
at org.datanucleus.store.rdbms.table.AbstractTable.create(AbstractTable.java:546)
at org.datanucleus.store.rdbms.table.AbstractTable.exists(AbstractTable.java:609)
at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3385)
at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2896)
at org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:119)
at
...[truncated 36339941 chars]...
eImpl(RetryClientInvocationHandler.java:94)] Calling getAllUpdatesFrom
Attachments
Attachments
Issue Links
- links to