IMPALA / IMPALA-10757

ACID table locking for DML statements is faulty

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • None
    • None
    • Component/s: Frontend
    • None

    Description

      Plain SELECT queries don't take ACID locks. They use the latest snapshot of the table that is loaded by CatalogD.

      However, a DML statement locks every table it references, not just the target table.

      E.g.:

      INSERT INTO target_table SELECT * FROM source_table;
      

      acquires locks for both target_table and source_table. However, after acquiring the locks Impala doesn't reload the tables.

      Therefore the following situation is possible:

      INSERT OVERWRITE foo SELECT ...; -- takes an exclusive lock on foo
      

      while the following statement also tries to take a SHARED_LOCK for foo:

      INSERT INTO bar SELECT * FROM foo;
      

      This means the INSERT INTO statement might wait for the INSERT OVERWRITE statement to complete, but because it doesn't reload foo after acquiring the lock, it still reads the old snapshot of foo. Waiting for the lock therefore brings no benefit.
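
The sequence above can be illustrated with a toy model. This is only a sketch of the ordering problem, not Impala code: `ToyCatalog`, `commit`, `load`, and `run_scenario` are hypothetical names, and the lock wait is modelled simply as the point at which the concurrent overwrite commits.

```python
# Toy model of the stale-snapshot problem. All names here are hypothetical
# and do not correspond to Impala or Hive Metastore classes.

class ToyCatalog:
    """Holds the latest committed snapshot of each table."""

    def __init__(self):
        self.snapshots = {}  # table name -> (version, rows)

    def commit(self, table, rows):
        # Each commit replaces the table contents and bumps the version.
        version = self.snapshots.get(table, (0, []))[0] + 1
        self.snapshots[table] = (version, list(rows))

    def load(self, table):
        return self.snapshots[table]


def run_scenario():
    catalog = ToyCatalog()
    catalog.commit("foo", ["old_row"])

    # 1. INSERT OVERWRITE foo ... holds the exclusive lock on foo.
    # 2. INSERT INTO bar SELECT * FROM foo is analyzed and loads foo's snapshot.
    planned_version, planned_rows = catalog.load("foo")

    # 3. It requests a shared lock on foo and waits; meanwhile the overwrite commits.
    catalog.commit("foo", ["new_row"])

    # 4. The lock is granted, but the statement keeps the snapshot from step 2.
    latest_version, latest_rows = catalog.load("foo")
    return planned_version, planned_rows, latest_version, latest_rows
```

In this model the statement plans against version 1 of foo even though version 2 is already committed by the time its lock is granted.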

      Possible solutions:

      1. Re-load tables after the lock is acquired
      2. Only take a lock on the target table. This would be better than the current behavior, and it would be consistent with plain SELECT queries.

      I think reloading should be favored, because Impala should run every statement that involves ACID tables in a transaction and take proper locks; see IMPALA-8788.
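
The two options could be sketched roughly as follows. This is a simplified model under stated assumptions, not a proposed patch: `ToyCatalog`, `plan_insert`, the `acquire_lock` callback, and the strategy names are all hypothetical, and table versions stand in for snapshots.

```python
# Toy sketch of the two proposals. Names are hypothetical, not Impala APIs.

class ToyCatalog:
    def __init__(self):
        self.versions = {}  # table name -> latest committed version

    def commit(self, table):
        self.versions[table] = self.versions.get(table, 0) + 1

    def load(self, table):
        return self.versions[table]


def plan_insert(catalog, target, sources, acquire_lock, strategy):
    """Return (locked tables, source versions the statement would read)."""
    # Analysis time: snapshots are loaded before any lock is taken.
    snapshot = {t: catalog.load(t) for t in sources}

    if strategy == "reload_after_lock":
        # Option 1: lock everything referenced, then refresh the snapshots.
        locked = [target] + list(sources)
        for table in locked:
            acquire_lock(table)
        snapshot = {t: catalog.load(t) for t in sources}
    elif strategy == "lock_target_only":
        # Option 2: sources are read like a plain SELECT, without a lock.
        locked = [target]
        acquire_lock(target)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return locked, snapshot
```

With option 1 the statement sees whatever was committed while it waited for the lock; with option 2 it never waits on the source tables and reads the snapshot it already had, which matches what a plain SELECT does today.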

          People

            Assignee: Unassigned
            Reporter: Zoltán Borók-Nagy (boroknagyz)