Hive / HIVE-13212

Locking too coarse/broad for update/delete on a partition

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.2.1
    • Fix Version/s: None
    • Component/s: Transactions
    • Labels: None

    Description

      create table acidTblPart (a int, b int) partitioned by (p string) clustered by (a) into " + BUCKET_COUNT + " buckets stored as orc TBLPROPERTIES ('transactional'='true')

      update acidTblPart set b = 17 where p = 1

      This acquires a share_write lock on the whole table, although from p = 1 we should be able to determine that only one partition is affected and lock only that partition.

      The same should apply to DELETE.

      The above holds when the table is empty. If the table has data, and in particular already has a p=1 partition, then only that partition is locked.

      However, when the table is not empty, "update acidTblPart set b = 17 where b = 18" locks every partition separately.
      For a table with 100K partitions this will be a performance issue.
      We need to look into acquiring a table-level lock instead, or build general lock promotion logic.
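
      A minimal sketch of what such lock promotion could look like, under stated assumptions: LockPromotionSketch, lockTargets and the threshold parameter are hypothetical names for illustration, not part of Hive's lock manager. Once the number of matched partitions exceeds the threshold, one table-level lock is requested instead of one lock per partition.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration only; this is not Hive's lock manager API. */
public class LockPromotionSketch {

    /**
     * Returns the names of the objects to lock for an update/delete.
     *
     * @param table             table name, e.g. "acidTblPart"
     * @param matchedPartitions partitions the statement may touch, e.g. "p=1"
     * @param threshold         assumed cutoff above which locks are promoted
     */
    static List<String> lockTargets(String table, List<String> matchedPartitions, int threshold) {
        // No partition is known (e.g. an empty table), so nothing finer than
        // the table can be locked.
        if (matchedPartitions.isEmpty()) {
            return Collections.singletonList(table);
        }
        // Too many partitions: promote to a single table-level lock.
        if (matchedPartitions.size() > threshold) {
            return Collections.singletonList(table);
        }
        // Otherwise lock each affected partition individually.
        List<String> targets = new ArrayList<>();
        for (String partition : matchedPartitions) {
            targets.add(table + "/" + partition);
        }
        return targets;
    }
}
```

      The threshold is a design choice: set too low it gives up partition-level concurrency, set too high it leaves the 100K-lock case in place.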

      The logic in SemanticAnalyzer appears to take all known partitions of the table being read and create a ReadEntity object for each one that matches the WHERE clause.
      A ReadEntity for the table itself is also created, but the logic in UpdateDeleteSemanticAnalyzer causes it to be ignored.
      (We call setUpdateOrDelete() on it, but remove the corresponding WriteEntity and replace it with a WriteEntity for each partition.)
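
      The behavior described above can be modeled roughly as follows. This is a simplified sketch, not the actual SemanticAnalyzer or UpdateDeleteSemanticAnalyzer code: entities are plain strings, EntityRewriteSketch and writeEntities are hypothetical names, and it assumes the table-level entity is replaced only when at least one partition matches, which is consistent with the empty-table case reported above.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Simplified model of the reported behavior; not Hive's real classes. */
public class EntityRewriteSketch {

    /**
     * Returns the write entities (lock targets) for an update/delete.
     *
     * @param table           table name
     * @param knownPartitions partitions that currently exist in the table
     * @param matchesWhere    whether a partition may satisfy the WHERE clause;
     *                        a predicate on a non-partition column cannot
     *                        exclude any partition, so it is always true
     */
    static Set<String> writeEntities(String table,
                                     List<String> knownPartitions,
                                     Predicate<String> matchesWhere) {
        Set<String> outputs = new LinkedHashSet<>();
        outputs.add(table); // initial table-level write entity

        // One entity per existing partition that matches the WHERE clause.
        Set<String> partitionEntities = new LinkedHashSet<>();
        for (String partition : knownPartitions) {
            if (matchesWhere.test(partition)) {
                partitionEntities.add(table + "/" + partition);
            }
        }

        // Assumption: the table entity is swapped for partition entities
        // only when some partition matched.
        if (!partitionEntities.isEmpty()) {
            outputs.remove(table);
            outputs.addAll(partitionEntities);
        }
        return outputs;
    }
}
```

      In this model an empty table yields a single table-level entity, "where p = 1" with an existing p=1 partition yields only that partition, and "where b = 18" yields one entity per partition, which is the case that scales poorly.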

            People

              Assignee: Unassigned
              Reporter: Eugene Koifman (ekoifman)