Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Duplicate
Description
One of the problems with CREATE/DROP events is that they may occur while a batch is being processed, and hence EventsProcessor may not be aware of them.
For example, consider the following sequence of statements:
create table foo (c1 int);
drop table foo;
create table foo (c2 int);
drop table foo;
These statements generate the event sequence CREATE_TABLE, DROP_TABLE, CREATE_TABLE, DROP_TABLE. Generally, if all 4 events are fetched in one batch, the first and third events (both CREATE_TABLE) are ignored because each is followed by a DROP_TABLE later in the sequence, and the DROP_TABLE events have no effect since the table doesn't exist in catalogd anymore.
However, if the events processor fetches these events in two batches (of 3 and 1 events), then when the first batch of CREATE_TABLE, DROP_TABLE, CREATE_TABLE is processed, the third event adds the table foo to catalogd. The DROP_TABLE in the subsequent batch is then processed and removes the table, but between the two batches catalogd reports that a table called foo exists. This can cause statements to fail. For example, a statement like create table foo (c3 int) issued after the above statements fails with a TableAlreadyExists error.
The problem happens for databases too. So far I have not been able to reproduce it for partitions, but I don't see why it would not happen with partitions as well.
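The batching effect described above can be sketched with a toy model. This is only an illustration, not Impala's actual EventsProcessor code: the catalog is a plain set of table names, and the names process_batch and states_after_batches are hypothetical. The assumed rule is the one described above, that a CREATE_TABLE is skipped when a later DROP_TABLE for the same table appears in the same batch.

```python
from typing import List, Set, Tuple

Event = Tuple[str, str]  # (event_type, table_name)


def process_batch(batch: List[Event], catalog: Set[str]) -> None:
    """Apply one batch of events to a toy catalog (a set of table names).

    Assumption: a CREATE_TABLE is ignored if a DROP_TABLE for the same
    table appears later in the same batch.
    """
    for i, (kind, table) in enumerate(batch):
        if kind == "CREATE_TABLE":
            dropped_later = any(
                k == "DROP_TABLE" and t == table for k, t in batch[i + 1:]
            )
            if not dropped_later:
                catalog.add(table)
        elif kind == "DROP_TABLE":
            # No effect if the table is not in the catalog.
            catalog.discard(table)


def states_after_batches(events: List[Event], batch_sizes: List[int]) -> List[List[str]]:
    """Return the catalog contents observed after each batch is processed."""
    catalog: Set[str] = set()
    snapshots = []
    start = 0
    for size in batch_sizes:
        process_batch(events[start:start + size], catalog)
        snapshots.append(sorted(catalog))
        start += size
    return snapshots


EVENTS = [
    ("CREATE_TABLE", "foo"),
    ("DROP_TABLE", "foo"),
    ("CREATE_TABLE", "foo"),
    ("DROP_TABLE", "foo"),
]

print(states_after_batches(EVENTS, [4]))     # [[]]
print(states_after_batches(EVENTS, [3, 1]))  # [['foo'], []]
```

With a single batch of 4 the table never appears in the toy catalog. With batches of 3 and 1, foo is visible after the first batch, which is the window in which a new create table foo would fail.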
Issue Links
- duplicates: IMPALA-10502 delayed 'Invalidated objects in cache' cause 'Table already exists' (Resolved)
- is caused by: IMPALA-7954 Support automatic invalidates using metastore notification events (Resolved)