Currently, long-running DDL/DML operations can block other operations from making progress if they run concurrently with the getCatalogObjects() call that creates catalog updates. The reason is that getCatalogObjects() holds the catalog lock for its entire duration while also trying to acquire the locks of the tables it processes. If it is blocked on a table lock held by another operation, then any other, unrelated catalog write operation cannot make progress, because it cannot acquire the catalog lock held by getCatalogObjects().
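The blocking pattern described above can be reproduced with a minimal, self-contained sketch. This is not the actual catalog code: the class, method, and variable names (CatalogLockSketch, unrelatedWriterProgresses, catalogLock, tableLock) are hypothetical, and two plain ReentrantLock instances stand in for the catalog lock and a single table lock. The second mode, which releases the catalog lock before waiting on the table lock, only illustrates one way the contention could be avoided; it is an assumption and not necessarily how the fix is implemented.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; none of these names come from the real catalog code.
class CatalogLockSketch {

    /**
     * Simulates a long-running DDL that holds a table lock while a
     * getCatalogObjects()-like collector tries to lock the same table.
     * Returns true if an unrelated catalog writer can acquire the catalog
     * lock within timeoutMs while the collector is still waiting.
     */
    static boolean unrelatedWriterProgresses(boolean holdCatalogLockWhileWaiting,
                                             long timeoutMs) {
        ReentrantLock catalogLock = new ReentrantLock();
        ReentrantLock tableLock = new ReentrantLock();
        CountDownLatch collectorReady = new CountDownLatch(1);
        boolean[] acquired = new boolean[1];

        // The calling thread plays the long-running DDL holding the table lock.
        tableLock.lock();

        Thread collector = new Thread(() -> {
            catalogLock.lock();
            if (!holdCatalogLockWhileWaiting) {
                // Assumed alternative: drop the catalog lock before waiting.
                catalogLock.unlock();
            }
            collectorReady.countDown();
            tableLock.lock();   // blocks until the simulated DDL finishes
            tableLock.unlock();
            if (holdCatalogLockWhileWaiting) {
                catalogLock.unlock();
            }
        });

        // An unrelated catalog write that needs only the catalog lock.
        Thread writer = new Thread(() -> {
            try {
                if (catalogLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                    acquired[0] = true;
                    catalogLock.unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        try {
            collector.start();
            collectorReady.await();  // collector is now about to wait on the table
            writer.start();
            writer.join();           // writer either got the lock or timed out
            tableLock.unlock();      // simulated DDL finishes
            collector.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acquired[0];
    }

    public static void main(String[] args) {
        System.out.println("catalog lock held while waiting: "
            + unrelatedWriterProgresses(true, 200));
        System.out.println("catalog lock released before waiting: "
            + unrelatedWriterProgresses(false, 200));
    }
}
```

In the first mode the unrelated writer times out because the collector keeps the catalog lock while it waits on the table. In the second mode the writer acquires the catalog lock immediately, even though the table is still locked by the simulated long-running operation.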
From a user's point of view, concurrent DDL/DML operations are executed serially and, consequently, the latency of DDL/DML operations may vary significantly. With the fix for this issue, concurrent DDL/DML operations should be able to run concurrently, and the throughput of these operations should increase significantly. At the same time, the latency of a DDL/DML operation should no longer depend on other operations that are running at the same time. Note that latency here is measured with respect to the coordinator that initiates the operation; the fix does not improve the latency of broadcasting metadata changes through the statestore. Some common use cases where this fix is applicable are the following:
- Concurrent REFRESH operations on different tables.
- Concurrent ALTER TABLE operations on different tables.