Description
With recent HEAD, I am unable to create or killall a job. The client always fails with the following error:
aurora create cp0/bhuvan/staging10/hello hello_world.aurora [stories/apps-in-docker] 15:35:57
INFO] Creating job hello
INFO] Starting new HTTP connection (1): a005832.vp.iso.apple.com
INFO] Starting new HTTP connection (1): a005832.vp.iso.apple.com
INFO] Response from scheduler: LOCK_ERROR (message: Unable to perform operation for: bhuvan/staging10/hello. Use override/cancel option.)
INFO] Note: if the scheduler detects that a job update is in progress (or was not properly completed) it will reject subsequent updates. This is because your job is likely in a partially-updated state. You should only begin another update if you are confident that nobody is updating this job, and that the job is in a state suitable for an update. After checking on the above, you may release the update lock on the job by invoking cancel_update.
The scheduler log, at FINE log level, shows that one lock is held, but that lock is held by a completely different task. I confirmed this by querying the /locks endpoint. This is the commit where lockMapper was changed to use LEFT OUTER JOIN:
https://github.com/apache/incubator-aurora/commit/5cf760bf31315c220c0f17cc233ad3a1dcfb6d86
D0806 22:37:34.903 THREAD1754 org.apache.ibatis.logging.jdbc.BaseJdbcLogger.debug: ==> Preparing: SELECT * FROM locks LEFT OUTER JOIN job_keys AS key ON key.role = ? AND key.environment = ? AND key.name = ? AND key.id = job_key_id
D0806 22:37:34.903 THREAD1754 org.apache.ibatis.logging.jdbc.BaseJdbcLogger.debug: ==> Parameters: bhuvan(String), staging10(String), hello(String)
D0806 22:37:34.904 THREAD1754 org.apache.ibatis.logging.jdbc.BaseJdbcLogger.debug: <== Total: 1
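A likely explanation for the "Total: 1" result above: in a LEFT OUTER JOIN, conditions in the ON clause only decide which job_keys rows get attached. They never remove rows from locks, so any existing lock comes back even when its job key does not match. The sketch below reproduces that behaviour in an in-memory SQLite database. It is only an illustration under assumptions: the two-table schema is a hypothetical reduction to the columns named in the logged query, the `token` column and the job name `other` are made up, the alias is written `jk` instead of `key`, and the INNER JOIN variant is shown for contrast, not as the fix adopted upstream.

```python
import sqlite3

# Hypothetical, simplified schema: only the columns named in the logged query,
# plus a made-up "token" column so the returned lock row is recognisable.
SCHEMA = """
CREATE TABLE job_keys (
    id INTEGER PRIMARY KEY, role TEXT, environment TEXT, name TEXT);
CREATE TABLE locks (
    id INTEGER PRIMARY KEY,
    job_key_id INTEGER REFERENCES job_keys(id),
    token TEXT);
"""

# Same shape as the logged statement; {join} is swapped between join types.
QUERY = """
SELECT locks.token, jk.name
FROM locks {join} job_keys AS jk
  ON jk.role = ? AND jk.environment = ? AND jk.name = ?
 AND jk.id = locks.job_key_id
"""


def find_locks(conn, join, role, environment, name):
    """Run the lock lookup with the given join type and return all rows."""
    sql = QUERY.format(join=join)
    return conn.execute(sql, (role, environment, name)).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # One lock exists, held by a different job ("other" is a made-up name).
    conn.execute(
        "INSERT INTO job_keys VALUES (1, 'bhuvan', 'staging10', 'other')")
    conn.execute(
        "INSERT INTO locks VALUES (1, 1, 'token-held-by-other-job')")

    # Look up locks for bhuvan/staging10/hello, which holds no lock.
    args = ("bhuvan", "staging10", "hello")
    outer = find_locks(conn, "LEFT OUTER JOIN", *args)
    inner = find_locks(conn, "INNER JOIN", *args)
    conn.close()
    return outer, inner


if __name__ == "__main__":
    outer, inner = demo()
    print("LEFT OUTER JOIN:", outer)  # the unrelated lock, with a NULL job key
    print("INNER JOIN:     ", inner)  # no rows
```

With the outer join, the unrelated lock is returned with NULL job key columns, which would match the scheduler reporting a lock on bhuvan/staging10/hello. With the inner join, the lookup returns no rows for a job that holds no lock.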