Description
The problem was discovered by vamsee when he manually cleared the HMS notifications.
Sentry goes into a loop:
2017-09-06 16:56:59,923 INFO hive.metastore: Closed a connection to metastore, current connections: 10
2017-09-06 16:56:59,932 INFO hive.metastore: Closed a connection to metastore, current connections: 9
2017-09-06 16:56:59,938 INFO hive.metastore: Closed a connection to metastore, current connections: 8
2017-09-06 16:56:59,946 INFO hive.metastore: Closed a connection to metastore, current connections: 7
2017-09-06 16:56:59,949 INFO hive.metastore: Closed a connection to metastore, current connections: 6
2017-09-06 16:56:59,949 INFO hive.metastore: Closed a connection to metastore, current connections: 5
2017-09-06 16:56:59,952 INFO hive.metastore: Closed a connection to metastore, current connections: 4
2017-09-06 16:56:59,952 INFO hive.metastore: Closed a connection to metastore, current connections: 3
2017-09-06 16:56:59,955 INFO hive.metastore: Closed a connection to metastore, current connections: 2
2017-09-06 16:56:59,955 INFO org.apache.sentry.service.thrift.SentryHMSClient: Obtained full HMS snapshot
2017-09-06 16:56:59,959 INFO org.apache.sentry.service.thrift.SentryHMSClient: NotificationID, Before Snapshot: 263, After Snapshot 263
2017-09-06 16:56:59,959 INFO org.apache.sentry.service.thrift.SentryHMSClient: Successfully fetched hive full snapshot, Current NotificationID: CurrentNotificationEventId(eventId:263).
2017-09-06 16:56:59,963 ERROR org.apache.sentry.service.thrift.CounterWait: new counter value 263 is smaller then the previous one 9025
2017-09-06 16:56:59,963 INFO org.apache.sentry.service.thrift.HMSFollower: Sentry HMS support is ready
2017-09-06 16:56:59,973 INFO org.apache.sentry.service.thrift.HMSFollower: The latest notification ID on HMS is less than the latest notification ID processed by Sentry. Need to request a full HMS snapshot.
2017-09-06 16:56:59,975 INFO org.apache.sentry.service.thrift.SentryHMSClient: Request full HMS snapshot
2017-09-06 16:56:59,975 INFO hive.metastore: Trying to connect to metastore with URI thrift://foo.com:9083
2017-09-06 16:56:59,981 INFO hive.metastore: Opened a connection to metastore, current connections: 3
2017-09-06 16:56:59,981 INFO hive.metastore: Connected to metastore.
2017-09-06 16:56:59,986 INFO hive.metastore: Closed a connection to metastore, current connections: 2
2017-09-06 16:56:59,986 INFO hive.metastore: Trying to connect to metastore with URI thrift://foo.com:9083
2017-09-06 16:56:59,986 INFO hive.metastore: Trying to connect to metastore with URI thrift://foo.com:9083
2017-09-06 16:56:59,987 INFO hive.metastore: Trying to connect to metastore with URI thrift://foo.com:9083
The problem is that when we detect that the latest notification ID on HMS is lower than the latest one processed by Sentry (263 versus 9025 in the log above) and read a new full snapshot, the DB may still contain a larger NotificationID, which causes us to request a new snapshot again.
As part of saving the full snapshot to the DB, we should remove all notifications with an ID higher than the one in the snapshot.
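The loop and the proposed fix can be illustrated with a small simulation. This is only a sketch, not Sentry code: NotificationStore, persist_snapshot, count_full_snapshots and the purge_newer flag are hypothetical names, and the DB is modeled as an in-memory set of notification IDs. The values 263 and 9025 are taken from the log above.

```python
class NotificationStore:
    """Hypothetical stand-in for the DB table of processed notification IDs."""

    def __init__(self, ids):
        self.ids = set(ids)

    def last_processed_id(self):
        # Highest notification ID recorded so far (0 if the table is empty).
        return max(self.ids, default=0)

    def persist_snapshot(self, snapshot_id, purge_newer):
        # Proposed fix: while saving the snapshot, drop every notification
        # whose ID is higher than the snapshot's notification ID.
        if purge_newer:
            self.ids = {i for i in self.ids if i <= snapshot_id}
        self.ids.add(snapshot_id)


def count_full_snapshots(store, hms_latest_id, purge_newer, max_iterations=5):
    """Count how many full snapshots the follower requests before it settles.

    max_iterations only bounds the simulation; without the purge the
    condition never clears.
    """
    snapshots = 0
    for _ in range(max_iterations):
        # Follower check: HMS is "behind" only if its latest ID is lower
        # than the latest ID already processed.
        if hms_latest_id >= store.last_processed_id():
            break
        store.persist_snapshot(hms_latest_id, purge_newer)
        snapshots += 1
    return snapshots


if __name__ == "__main__":
    # Stale ID 9025 stays in the store, so every iteration asks for a snapshot.
    print(count_full_snapshots(NotificationStore([9025]), 263, purge_newer=False))
    # Purging IDs above 263 lets the follower settle after one snapshot.
    print(count_full_snapshots(NotificationStore([9025]), 263, purge_newer=True))
```

Without the purge, the stale ID keeps the "HMS is behind" check true on every pass, which matches the repeated snapshot requests in the log. With the purge, the check clears after a single snapshot.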