An issue was identified by ekoifman in the streaming mutation API (HIVE-10165) where an insufficiently restrictive lock was being used when issuing updates and deletes to ACID tables and partitions. A shared lock was being used where a semi-shared lock is required. Additionally, the current lock scope targets the entire table, whereas in theory, if the table is partitioned, only the affected partitions need to participate in the semi-shared lock. However, there are a couple of technical challenges that currently prevent the locks from being applied on a per-partition basis:
- It is expected that the affected partitions are not known in advance so individual partition locks would need to be acquired as needed.
- The API is expected to execute in a clustered environment, so acquiring these locks on an ‘as needed’ basis presents a risk that the meta store may become overwhelmed. This is expected to be less of a problem when an HBase-based meta store is introduced (HIVE-9452).
- My understanding is that multiple fine-grained lock acquisitions for a single transaction are not possible at present. When they become available, they’ll introduce the possibility of deadlocks. This should be better handled when HIVE-9675 is complete.
Therefore, as advised, at this time the system will obtain a semi-shared lock on participating tables. Although this will prevent other concurrent writes, it will preserve snapshot isolation when reading.
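To illustrate the trade-off described above, here is a minimal sketch of the lock compatibility rules as I understand them: a semi-shared lock coexists with shared (read) locks but conflicts with another semi-shared lock. The class, enum, and method names are hypothetical and written only for this example; they are not the Hive lock manager API, and the matrix is an assumption rather than a copy of Hive's implementation.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical model for illustration only; not the Hive lock manager API.
class LockCompatibilitySketch {

    // Assumed lock modes: SHARED for readers, SEMI_SHARED for update/delete
    // writers, EXCLUSIVE for operations that must run alone.
    enum LockMode { SHARED, SEMI_SHARED, EXCLUSIVE }

    // Returns true if 'requested' can be granted while another transaction
    // already holds 'held' on the same table.
    static boolean compatible(LockMode held, LockMode requested) {
        switch (held) {
            case SHARED:
                // Readers tolerate other readers and a single writer.
                return requested != LockMode.EXCLUSIVE;
            case SEMI_SHARED:
                // A writer tolerates readers, but not a second writer.
                return requested == LockMode.SHARED;
            default:
                // EXCLUSIVE tolerates nothing.
                return false;
        }
    }

    // Returns true if 'requested' is compatible with every lock held by others.
    static boolean canGrant(Set<LockMode> heldByOthers, LockMode requested) {
        for (LockMode held : heldByOthers) {
            if (!compatible(held, requested)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<LockMode> held = EnumSet.of(LockMode.SEMI_SHARED);
        // A reader can still proceed while a mutation holds the table.
        System.out.println("reader admitted: " + canGrant(held, LockMode.SHARED));
        // A second concurrent mutation on the same table must wait.
        System.out.println("second writer admitted: " + canGrant(held, LockMode.SEMI_SHARED));
    }
}
```

Under these assumed rules, the original bug is visible in the first case of the switch: two mutators that each take only a shared lock would both be admitted, whereas taking a semi-shared lock at table level serializes writers while leaving readers unaffected.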