Details
- Type: Improvement
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
Impala 2.5.0, Impala 2.6.0, Impala 2.7.0
Description
Currently, long-running DDL/DML operations can block other operations from making progress if they run concurrently with the getCatalogObjects() call that creates catalog updates. The reason is that getCatalogObjects() holds the catalog lock for its entire duration while also trying to acquire the locks of the tables it processes. If it is blocked by another operation on a table, then any other, unrelated catalog write operation cannot make progress, because it cannot acquire the catalog lock held by getCatalogObjects().
From a user's point of view, concurrent DDL/DML operations are executed serially and, consequently, the latency of DDL/DML operations may vary significantly. With the fix for this issue, concurrent DDL/DML operations should be able to run concurrently, and the throughput of these operations should increase significantly. At the same time, the latency of a DDL/DML operation should not depend on other operations that are running at the same time. Note that latency here is measured with respect to the coordinator that initiates the operation; the fix does nothing to improve the latency of broadcasting metadata changes through the statestore. Some common use cases where this fix applies are the following:
- Concurrent REFRESH operations on different tables.
- Concurrent ALTER TABLE operations on different tables.
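The contention described above can be illustrated with a small, self-contained Java sketch. It is not Impala's catalog code: the class `CatalogLockSketch`, its methods, and the table names `t1` and `t2` are hypothetical stand-ins, and the second collection method shows only one possible way to shorten the time the catalog lock is held (snapshot the table names under the catalog lock, release it, then take the table locks one at a time). The actual fix may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of a catalog-wide lock plus one lock per table.
class CatalogLockSketch {
    private final ReentrantLock catalogLock = new ReentrantLock();
    private final Map<String, ReentrantLock> tableLocks = new ConcurrentHashMap<>();

    CatalogLockSketch(String... tables) {
        for (String t : tables) tableLocks.put(t, new ReentrantLock());
    }

    // Problematic pattern: the catalog lock stays held while waiting on table locks.
    List<String> collectHoldingCatalogLock() {
        catalogLock.lock();
        try {
            List<String> out = new ArrayList<>();
            for (Map.Entry<String, ReentrantLock> e : tableLocks.entrySet()) {
                e.getValue().lock(); // may wait behind a long-running DDL/DML
                try { out.add(e.getKey()); } finally { e.getValue().unlock(); }
            }
            return out;
        } finally {
            catalogLock.unlock();
        }
    }

    // Assumed alternative: hold the catalog lock only to snapshot the table names.
    List<String> collectWithShortCatalogLock() {
        List<String> names;
        catalogLock.lock();
        try { names = new ArrayList<>(tableLocks.keySet()); } finally { catalogLock.unlock(); }
        List<String> out = new ArrayList<>();
        for (String name : names) {
            ReentrantLock lock = tableLocks.get(name);
            lock.lock();
            try { out.add(name); } finally { lock.unlock(); }
        }
        return out;
    }

    // Stand-in for an unrelated catalog write: it only needs the catalog lock.
    boolean tryCatalogWrite(long timeoutMs) throws InterruptedException {
        if (!catalogLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) return false;
        catalogLock.unlock();
        return true;
    }

    // Simulates a long-running operation on t1 while a collector thread runs,
    // then reports whether an unrelated catalog write could proceed.
    static boolean unrelatedWriteSucceeds(boolean holdCatalogLock) {
        CatalogLockSketch catalog = new CatalogLockSketch("t1", "t2");
        ReentrantLock busyTable = catalog.tableLocks.get("t1");
        busyTable.lock(); // the "long-running DDL" on t1
        Thread collector = new Thread(() -> {
            if (holdCatalogLock) catalog.collectHoldingCatalogLock();
            else catalog.collectWithShortCatalogLock();
        });
        collector.start();
        try {
            // Wait until the collector is actually queued on t1's lock.
            while (!busyTable.hasQueuedThreads()) Thread.sleep(1);
            boolean ok = catalog.tryCatalogWrite(200);
            busyTable.unlock();
            collector.join();
            return ok;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            if (busyTable.isHeldByCurrentThread()) busyTable.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("catalog lock held throughout: " + unrelatedWriteSucceeds(true));
        System.out.println("catalog lock held briefly:    " + unrelatedWriteSucceeds(false));
    }
}
```

In this model, the unrelated write times out when the collector keeps the catalog lock while waiting on `t1`, and succeeds when the catalog lock is released before the table locks are taken. The trade-off in the second variant is that the set of tables can change between the snapshot and the per-table work, which a real implementation would have to account for.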
Attachments
Issue Links
- breaks
  - IMPALA-6486 INVALIDATE METADATA may hang after statestore restart (Resolved)
  - IMPALA-6948 Coordinators don't detect the deletion of tables that occurred outside of impala after catalog restart (Resolved)
  - IMPALA-6671 Metadata operations that modify a table blocks topic updates for other unrelated operations (Resolved)
  - IMPALA-7961 Concurrent catalog heavy workloads can cause queries with SYNC_DDL to fail fast (Resolved)
- duplicates
  - IMPALA-4799 Long running metadata load for large tables blocks queries/loading all other tables for long time (Resolved)
- is part of
  - IMPALA-5299 Improve catalog scalability and large catalog handling (Open)