[SENTRY-1813] LeaderStatusMonitor could get into limbo state upon ZK connection loss - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: 2.0.0
Fix Version/s: 2.0.0
Component/s: None
Labels: None

Description

During failover testing, I noticed that if the Sentry servers lose their connection to ZK, the server that is currently the leader gets into a limbo state: it interrupts the Curator-LeaderSelector thread, and that thread is never revived in the running Sentry process (unless the process is restarted).

Relevant code in LeaderStatusMonitor:
http://github.mtv.cloudera.com/CDH/sentry/blob/cdh5-1.5.1/sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/LeaderStatusMonitor.java#L243-L246

    try {
      isLeader = true;
      // Wait until we are interrupted or receive a signal
      cond.await();
    } catch (InterruptedException ignored) {
      Thread.currentThread().interrupt();
      LOG.info("LeaderStatusMonitor: interrupted");
    } finally {
      isLeader = false;
      lock.unlock();
      LOG.info("LeaderStatusMonitor: becoming standby");
    }

I realized that upon loss of the ZK connection, the Curator framework raises an InterruptedException in LeaderStatusMonitor, which then calls interrupt() on Thread.currentThread(), which is essentially the Curator-LeaderSelector thread.
<SCREENSHOT_ATTACHED>

So once the LeaderSelector thread is interrupted, this particular Sentry server loses the ability to participate in any future leader election. And if this happens to all the Sentry servers in the cluster, the cluster as a whole could get into a limbo state.
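
To illustrate the mechanism described above, here is a minimal JDK-only sketch. It is not Sentry or Curator code: the class name InterruptFlagSketch, the Result holder, and the thread name are hypothetical. It mimics the await/catch pattern from the snippet on a stand-in thread and checks that, after the catch block re-asserts the interrupt, the thread's interrupt status is still set and the next blocking call fails immediately. How Curator's selector thread reacts to that flag is an assumption this sketch does not verify.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// JDK-only illustration; all names here are hypothetical, not Sentry or Curator APIs.
class InterruptFlagSketch {

  /** What the stand-in thread observed. */
  static final class Result {
    volatile boolean wasInterrupted;         // cond.await() threw InterruptedException
    volatile boolean flagSetAfterCatch;      // interrupt status after the catch block ran
    volatile boolean nextBlockingCallFailed; // a later blocking call threw immediately
  }

  static Result run() {
    final ReentrantLock lock = new ReentrantLock();
    final Condition cond = lock.newCondition();
    final CountDownLatch waiting = new CountDownLatch(1);
    final Result result = new Result();

    Thread worker = new Thread(() -> {
      lock.lock();
      try {
        waiting.countDown();
        cond.await(); // same pattern as the snippet: wait for a signal or an interrupt
      } catch (InterruptedException ignored) {
        result.wasInterrupted = true;
        Thread.currentThread().interrupt(); // re-assert the interrupt, as the snippet does
      } finally {
        lock.unlock();
      }
      // The interrupt status is still set once the method body is done.
      result.flagSetAfterCatch = Thread.currentThread().isInterrupted();
      try {
        Thread.sleep(10); // any blocking call made by this thread now fails right away
      } catch (InterruptedException e) {
        result.nextBlockingCallFailed = true;
      }
    }, "leader-selector-stand-in");

    worker.start();
    try {
      waiting.await(); // the worker holds the lock and is about to await
      lock.lock();     // succeeds only after the worker released the lock inside await()
      lock.unlock();
      worker.interrupt(); // stands in for the interrupt delivered on connection loss
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("driver thread interrupted", e);
    }
    return result;
  }

  public static void main(String[] args) {
    Result r = run();
    System.out.println("await interrupted:          " + r.wasInterrupted);
    System.out.println("flag set after catch:       " + r.flagSetAfterCatch);
    System.out.println("next blocking call failed:  " + r.nextBlockingCallFailed);
  }
}
```

All three values are expected to be true when this runs, which is consistent with the observation that the interrupted thread does not simply go back to waiting for leadership.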

While in this state, Sentry no longer reads events from HMS, so users can no longer issue DDL statements such as CREATE. However, GRANT and REVOKE still work, as they don't go through HMSFollower.

Attachments

Screenshot.png
23/Jun/17 07:12
355 kB
Vamsee K. Yarlagadda

Issue Links

duplicates

SENTRY-2079 Sentry HA leader monitor does not work due to a mix of curator versions in the classpath

Resolved

People

Assignee: Vamsee K. Yarlagadda

Reporter: Vamsee K. Yarlagadda

Votes: 0

Watchers: 3

Dates

Created: 23/Jun/17 07:12

Updated: 01/Dec/17 16:33

Resolved: 01/Dec/17 16:33