Details
Type: Bug
Status: Closed
Priority: Minor
Resolution: Fixed
Fix Version/s: 4.0.0
Description
Hive DDL statements are intermittently failing with the error: Couldn't acquire the DB log notification lock because we reached the maximum # of retries: 5 retries
Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Couldn't acquire the DB log notification lock because we reached the maximum # of retries: 5 retries. If this happens too often, then is recommended to increase the maximum number of retries on the hive.notification.sequence.lock.max.retries configuration :: Error executing SQL query "select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" for update".)
2018-08-28 01:17:56,808|INFO|MainThread|machine.py:183 - run()||GUID=94e6ff4d-5db8-45eb-8654-76f546e7f0b3|java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Couldn't acquire the DB log notification lock because we reached the maximum # of retries: 5 retries. If this happens too often, then is recommended to increase the maximum number of retries on the hive.notification.sequence.lock.max.retries configuration :: Error executing SQL query "select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" for update".)
It seems that metastore operations are slow on this cluster, so concurrent writes/DDL operations fail to lock the NOTIFICATION_SEQUENCE row for update within the allowed retries.
Currently, the sleep interval between retries is set via the config hive.notification.sequence.lock.retry.sleep.interval. The default value is 500 ms, which seems too small. Can we set higher values for the sleep interval and the retry count? For example:
hive.notification.sequence.lock.retry.sleep.interval=10s
hive.notification.sequence.lock.max.retries=10
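To make the effect of the two settings concrete, here is a minimal sketch of a retry-with-sleep loop and the sleep budget it implies. It is not the metastore's actual implementation: the names acquire_with_retries, max_wait_s and LockNotAcquired are hypothetical, and it assumes no sleep happens after the final failed attempt.

```python
import time


class LockNotAcquired(Exception):
    """Raised when the lock could not be taken within the retry budget."""


def acquire_with_retries(try_lock, max_retries, sleep_interval_s, sleep=time.sleep):
    """Call try_lock() until it returns True, at most max_retries times.

    Sleeps sleep_interval_s seconds between failed attempts (assumption:
    no sleep after the last attempt). Returns the number of attempts used.
    """
    for attempt in range(1, max_retries + 1):
        if try_lock():
            return attempt
        if attempt < max_retries:
            sleep(sleep_interval_s)
    raise LockNotAcquired(f"gave up after {max_retries} retries")


def max_wait_s(max_retries, sleep_interval_s):
    """Upper bound on the time spent sleeping between attempts."""
    return (max_retries - 1) * sleep_interval_s


# Values from the report: 5 retries, 500 ms apart.
print(max_wait_s(5, 0.5))   # 2.0
# Proposed values: 10 retries, 10 s apart.
print(max_wait_s(10, 10))   # 90
```

Under these assumptions, the current values give a caller about 2 seconds of waiting before the DDL fails, while the proposed values allow about 90 seconds, which leaves far more room for a slow metastore database to release the row lock.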