Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Currently, Impala batches events such as partition inserts and partition creations only if all of the following hold:
1. the next event is for the same table as the previous one
2. the next event's id is the previous one's id + 1
3. the next event has the same type as the previous one
(Condition 2 can be stricter than condition 1: if some events between the two were filtered out, the ids are no longer consecutive even though both events are for the same table.)
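The three conditions above can be sketched as follows. This is a minimal illustration, not Impala's actual code: the `Event` class, its fields, and the `canBatch`/`batch` helpers are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ConsecutiveBatchingSketch {
  // Hypothetical, simplified event; not Impala's MetastoreEvent class.
  static final class Event {
    final long id;
    final String table;
    final String type;
    Event(long id, String table, String type) {
      this.id = id;
      this.table = table;
      this.type = type;
    }
  }

  // True if 'next' may join a batch whose last event is 'prev' (conditions 1-3).
  static boolean canBatch(Event prev, Event next) {
    return next.table.equals(prev.table)
        && next.id == prev.id + 1
        && next.type.equals(prev.type);
  }

  // Groups already-filtered events (ordered by id) into batches.
  static List<List<Event>> batch(List<Event> events) {
    List<List<Event>> batches = new ArrayList<>();
    for (Event e : events) {
      if (!batches.isEmpty()) {
        List<Event> last = batches.get(batches.size() - 1);
        if (canBatch(last.get(last.size() - 1), e)) {
          last.add(e);
          continue;
        }
      }
      List<Event> fresh = new ArrayList<>();
      fresh.add(e);
      batches.add(fresh);
    }
    return batches;
  }

  public static void main(String[] args) {
    // Event 3 (on some other table) was filtered out, so ids 2 and 4 are
    // not consecutive and event 4 starts a new batch.
    List<Event> events = Arrays.asList(
        new Event(1, "db.t1", "ADD_PARTITION"),
        new Event(2, "db.t1", "ADD_PARTITION"),
        new Event(4, "db.t1", "ADD_PARTITION"));
    System.out.println(batch(events).size()); // prints 2
  }
}
```

The example in `main` shows the case from the parenthetical note: all three events are on the same table and have the same type, yet the gap in ids splits them into two batches.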
Another limitation is that only events from the same HMS batch can be merged. Currently, 1000 events are polled at a time: https://github.com/apache/impala/blob/94f4f1d82461d8f71fbd0d2e9082aa29b5f53a89/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java#L218
Making this batch size configurable could also be useful.
Event batching could be improved by batching all events that modify the same table into the current one, even if they are not consecutive, unless they are "cut" by:
a. an event on the same table but with a different type
b. a rename table event whose original or new name is the same as the table of the current event
If such a cutting event occurs, the events after it can only be merged into a newer event.
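The proposed rule can be sketched as below, under several assumptions. The `Event` class, the `RENAME_TABLE` type name, and the `newTable` field are hypothetical and do not mirror Impala's or HMS's real classes. The sketch treats every non-rename type as batchable and shows only how events are grouped, not at which point in the stream a merged batch would be applied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CrossEventBatchingSketch {
  // Hypothetical, simplified event; 'newTable' is set only for renames.
  static final class Event {
    final long id;
    final String table;
    final String type;
    final String newTable;
    Event(long id, String table, String type, String newTable) {
      this.id = id;
      this.table = table;
      this.type = type;
      this.newTable = newTable;
    }
  }

  static List<List<Event>> batch(List<Event> events) {
    List<List<Event>> batches = new ArrayList<>();
    // Table name -> batch that later events on that table may still join.
    Map<String, List<Event>> open = new HashMap<>();
    for (Event e : events) {
      List<Event> fresh = new ArrayList<>();
      fresh.add(e);
      if (e.type.equals("RENAME_TABLE")) {
        // Cut (b): a rename closes the open batches of both table names.
        open.remove(e.table);
        open.remove(e.newTable);
        batches.add(fresh);
        continue;
      }
      List<Event> current = open.get(e.table);
      if (current != null && current.get(0).type.equals(e.type)) {
        // Same table and type, no cut in between: merge, even if
        // events on other tables were interleaved.
        current.add(e);
      } else {
        // First event on this table, or cut (a): a different type.
        batches.add(fresh);
        open.put(e.table, fresh);
      }
    }
    return batches;
  }

  public static void main(String[] args) {
    List<Event> events = Arrays.asList(
        new Event(1, "db.t1", "ADD_PARTITION", null),
        new Event(2, "db.t2", "ADD_PARTITION", null),
        new Event(3, "db.t1", "ADD_PARTITION", null),
        new Event(4, "db.t1", "DROP_PARTITION", null),
        new Event(5, "db.t1", "ADD_PARTITION", null),
        new Event(6, "db.t2", "ADD_PARTITION", null));
    // Batches: {1,3}, {2,6}, {4}, {5}
    System.out.println(batch(events).size()); // prints 4
  }
}
```

In this example, events 1 and 3 merge despite event 2 on another table sitting between them, while event 4 (a different type on `db.t1`) cuts the batch so that event 5 cannot join events 1 and 3.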
Attachments
Issue Links
- relates to: HIVE-27746 Hive Metastore should send single AlterPartitionEvent with list of partitions (Resolved)