Uploaded image for project: 'Sentry (Retired)'
  1. Sentry (Retired)
  2. SENTRY-2491

Sentry High availability unit tests run into deadlock sometimes

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 2.2.0
    • 2.2.0
    • Sentry
    • None

    Description

      In sentry unit tests, we don't create schema before running a test. Instead, we use dataNucleus to create sentry tables when they are accessed. This creates potential deadlock when running test for Sentry HA setup.

      For example, the following shows the event sequence that causes deadlock
      1) thread_1 gets shared lock of SYSTABLES in order to read table SENTRY_HMS_NOTIFICATION_ID
      2) thread_2 gets shared lock of SYSTABLES in order to read table SENTRY_HMS_NOTIFICATION_ID
      3) thread_1 tries to get execution lock to create table SENTRY_HMS_NOTIFICATION_ID,
      and wait for execution lock because thread_2 got shared lock already.
      4) thread_2 tries to get execution lock to create table SENTRY_HMS_NOTIFICATION_ID,
      and wait for execution lock because thread_1 got shared lock already.

      The solution is to let the instances of sentry service start with delay. Specifically,
      let HMS follower threads separate as far as possible, i.e., half of the interval.

      This deadlock only exists in unit tests, and does not exist in production because schema is created before starting Sentry services. Therefore, there is no table creation after service starts.

      Example of the log message when such deadlock happens

      2019-01-04 18:32:46,332 (pool-13-thread-1) [INFO - org.apache.sentry.hdfs.DBUpdateForwarder.getAllUpdatesFrom(DBUpdateForwarder.java:140)] (org.apache.sentry.hdfs.PermImageRetriever) Requested sequence number 0 is less than 0 or requested deltas for that sequence number are not available. Fetch a full update
      2019-01-04 18:32:49,346 (hms-follower) [DEBUG - org.apache.sentry.service.thrift.SentryStateBank.enableState(SentryStateBank.java:102)] HMSFollower entered state STARTED
      2019-01-04 18:32:50,091 (hms-follower) [DEBUG - org.apache.sentry.service.thrift.SentryStateBank.enableState(SentryStateBank.java:102)] HMSFollower entered state STARTED
      2019-01-04 18:33:00,286 (store-cleaner) [ERROR - org.datanucleus.util.Log4JLogger.error(Log4JLogger.java:115)] Error thrown executing CREATE TABLE SENTRY_HMS_NOTIFICATION_ID
      (
          NOTIFICATION_ID BIGINT NOT NULL
      ) : A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
      Lock : ROW, SYSTABLES, (1,3)
        Waiting XID : {436, X} , SENTRY, CREATE TABLE SENTRY_HMS_NOTIFICATION_ID
      (
          NOTIFICATION_ID BIGINT NOT NULL
      )
        Granted XID : {419, S} 
      Lock : ROW, SYSTABLES, (1,3)
        Waiting XID : {419, X} , SENTRY, CREATE TABLE SENTRY_HMS_NOTIFICATION_ID
      (
          NOTIFICATION_ID BIGINT NOT NULL
      )
        Granted XID : {419, S} , {436, S} 
      . The selected victim is XID : 436.
      java.sql.SQLTransactionRollbackException: A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
      Lock : ROW, SYSTABLES, (1,3)
        Waiting XID : {436, X} , SENTRY, CREATE TABLE SENTRY_HMS_NOTIFICATION_ID
      (
          NOTIFICATION_ID BIGINT NOT NULL
      )
        Granted XID : {419, S} 
      Lock : ROW, SYSTABLES, (1,3)
        Waiting XID : {419, X} , SENTRY, CREATE TABLE SENTRY_HMS_NOTIFICATION_ID
      (
          NOTIFICATION_ID BIGINT NOT NULL
      )
        Granted XID : {419, S} , {436, S} 
      . The selected victim is XID : 436.
      	at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
      	at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
      	at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
      	at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
      	at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
      	at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
      	at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
      	at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
      	at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
      	at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:254)
      	at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:879)
      	at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatementList(AbstractTable.java:830)
      	at org.datanucleus.store.rdbms.table.AbstractTable.create(AbstractTable.java:546)
      	at org.datanucleus.store.rdbms.table.AbstractTable.exists(AbstractTable.java:609)
      	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3385)
      	at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2896)
      	at org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:119)
      	at 
      ...[truncated 36339941 chars]...
      eImpl(RetryClientInvocationHandler.java:94)] Calling getAllUpdatesFrom
      

      Attachments

        1. SENTRY-2491.003.patch
          2 kB
          Na Li
        2. SENTRY-2491.002.patch
          2 kB
          Na Li
        3. SENTRY-2491.002.patch
          2 kB
          Na Li
        4. SENTRY-2491.002.patch
          2 kB
          Na Li
        5. SENTRY-2491.002.patch
          2 kB
          Na Li
        6. SENTRY-2491.001.patch
          2 kB
          Na Li

        Issue Links

          Activity

            People

              linaataustin Na Li
              linaataustin Na Li
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: