Details
- Bug
- Status: Open
- Major
- Resolution: Unresolved
- 1.2.1
- None
- None
Description
create table acidTblPart (a int, b int) partitioned by (p string) clustered by (a) into " + BUCKET_COUNT + " buckets stored as orc TBLPROPERTIES ('transactional'='true')
update acidTblPart set b = 17 where p = 1
This acquires a share_write lock on the whole table, even though the predicate p = 1 is enough to determine that only one partition is affected, so only that partition should be locked.
The same should apply to DELETE.
The above holds when the table is empty. If the table has data, and in particular already has the p=1 partition, then only that partition is locked.
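The desired behaviour can be sketched as follows. This is a minimal illustration, not Hive code: lock_target is a hypothetical helper, and it assumes the WHERE clause has already been reduced to a map of column-to-literal equality predicates. If every partition column is pinned by an equality, a partition-level target is returned whether or not the partition exists yet; otherwise the target is the table.

```python
from typing import Dict, List, Tuple


def lock_target(table: str,
                partition_cols: List[str],
                equalities: Dict[str, str]) -> Tuple[str, str]:
    """Pick the narrowest lock target implied by the WHERE clause.

    `equalities` maps column name -> literal for `col = literal` predicates
    (an assumption of this sketch; real predicates are more general).
    """
    # Every partition column is pinned: the statement touches one partition.
    if partition_cols and all(col in equalities for col in partition_cols):
        spec = "/".join(f"{col}={equalities[col]}" for col in partition_cols)
        return ("PARTITION", f"{table}/{spec}")
    # Otherwise nothing can be narrowed statically: fall back to the table.
    return ("TABLE", table)


# update acidTblPart set b = 17 where p = 1
print(lock_target("acidTblPart", ["p"], {"p": "1"}))
# update acidTblPart set b = 17 where b = 18
print(lock_target("acidTblPart", ["p"], {"b": "18"}))
```

Because the decision uses only the predicate and the table's partition columns, it would give the same answer for an empty table as for a populated one.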
However, if the table is not empty, "update acidTblPart set b = 17 where b = 18" locks every partition separately.
For a table with 100K partitions this will be a performance issue.
We need to look into acquiring a table-level lock in this case, or build general lock promotion logic.
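One possible shape for lock promotion is sketched below. It is an assumption-laden illustration, not an existing Hive feature: promote_locks and its threshold parameter are hypothetical, and the default of 1000 is an arbitrary placeholder. When a statement would lock more partitions than the threshold, a single table-level lock is requested instead.

```python
from typing import List


def promote_locks(table: str,
                  partitions: List[str],
                  threshold: int = 1000) -> List[str]:
    """Return the lock targets for a statement touching `partitions`.

    `threshold` is a hypothetical tuning knob: above it, per-partition
    locks are replaced by one lock on the whole table.
    """
    if len(partitions) > threshold:
        return [table]
    return [f"{table}/{p}" for p in partitions]


# 100K partitions collapse into one table-level lock request.
many = [f"p={i}" for i in range(100_000)]
print(len(promote_locks("acidTblPart", many)))
# A single partition stays a partition-level lock.
print(promote_locks("acidTblPart", ["p=1"]))
```

The trade-off is reduced concurrency: a table-level lock blocks writers to partitions the statement might not actually change.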
The logic in SemanticAnalyzer seems to be to take all known partitions of the table being read and create ReadEntity objects for those that match the WHERE clause.
A ReadEntity for the table is also created, but due to logic in UpdateDeleteSemanticAnalyzer we ignore it.
(We call setUpdateOrDelete() on it, but remove the corresponding WriteEntity and replace it with a WriteEntity for each partition.)
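The entity handling described above can be modelled with a toy sketch. The Entity class and write_entities function are hypothetical stand-ins, not Hive's ReadEntity/WriteEntity API; the sketch only mirrors the behaviour reported in this issue, where the table-level write survives only if no partition was read.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    """Toy stand-in for a read or write entity (table or partition)."""
    table: str
    partition: Optional[str] = None


def write_entities(table: str, matching_partitions: List[str]) -> List[Entity]:
    """Model which write entities (and hence locks) an UPDATE ends up with."""
    # Reads: one entity for the table plus one per matching known partition.
    reads = [Entity(table)] + [Entity(table, p) for p in matching_partitions]
    partition_reads = [e for e in reads if e.partition is not None]
    if partition_reads:
        # The table-level write is dropped in favour of one per partition.
        return [Entity(e.table, e.partition) for e in partition_reads]
    # Empty table: no partitions are known, so only the table write remains.
    return [Entity(table)]


print(write_entities("acidTblPart", []))
print(write_entities("acidTblPart", ["p=1"]))
```

Under this model the number of write entities grows linearly with the number of matching partitions, which is the source of the concern for tables with very many partitions.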
Issue Links
- is related to HIVE-15032 Update/Delete statements use dynamic partitions when it's not necessary (Open)