Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- Impala 1.2
- None
- None
Description
Impala does not use a partition's HDFS location as the sink location for INSERT queries. Instead, it uses the parent table's HDFS base directory and builds the partition directory path from the partition keys.
This means data will not get inserted into the expected location if you do something like:
CREATE TABLE Foo(i int) PARTITIONED BY (j int) LOCATION '/test-warehouse/foo';
ALTER TABLE Foo ADD PARTITION(j=1);
...
ALTER TABLE Foo PARTITION(j=1) SET LOCATION '/test-warehouse/another_path/j=1';
INSERT INTO Foo PARTITION(j=1) SELECT 1;  -- this will go to /test-warehouse/foo/j=1 instead of the new path
When scanning the table, it seems we do use the correct path, so the inserted data will not be reflected when the user queries the table.
This is because we don't pass all the partition paths to the backend (BE) when executing the insert. We just set the HDFS base directory and the partition expressions.
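The difference between the two behaviors can be sketched outside of Impala. The following Python snippet is only an illustration, not Impala's planner or backend code: the function names (`default_partition_path`, `resolve_sink_path`) and the `partition_locations` mapping are hypothetical, and the fix shown (look up the partition's own location first, fall back to the base directory) is an assumption about what correct behavior looks like, not a description of the actual patch.

```python
from typing import Dict, Optional, Tuple

# A partition key as an ordered tuple of (column, value) pairs, e.g. (("j", "1"),).
PartitionKey = Tuple[Tuple[str, str], ...]


def default_partition_path(base_dir: str, key: PartitionKey) -> str:
    """Build <base_dir>/<col>=<val>/... purely from the partition key values."""
    parts = [f"{col}={val}" for col, val in key]
    return "/".join([base_dir.rstrip("/")] + parts)


def resolve_sink_path(
    base_dir: str,
    key: PartitionKey,
    partition_locations: Optional[Dict[PartitionKey, str]] = None,
) -> str:
    """Pick the INSERT sink path for one partition.

    Hypothetical helper: if a per-partition location is known, use it;
    otherwise fall back to the path derived from the table base directory.
    """
    if partition_locations and key in partition_locations:
        return partition_locations[key]
    return default_partition_path(base_dir, key)


base = "/test-warehouse/foo"
key = (("j", "1"),)
# Location set by ALTER TABLE ... PARTITION(j=1) SET LOCATION in the example above.
overrides = {key: "/test-warehouse/another_path/j=1"}

# Reported behavior: no partition paths are passed along, only the base directory.
buggy_path = resolve_sink_path(base, key)
# Expected behavior: the partition's own location takes precedence.
expected_path = resolve_sink_path(base, key, overrides)

print(buggy_path)
print(expected_path)
```

With the table from the description, the first call yields the path under the table's base directory, while the second yields the location that the scan side already uses.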
Issue Links
- relates to IMPALA-741 Impala forgets about partitions with non-existant locations (Resolved)