Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 0.8.0, 0.8.1, 0.9.0, 0.9.1, 0.10.0
- Fix Version/s: None
- Component/s: None
- Environment: java version "1.6.0_30", hive version 0.9.0, hadoop version 0.20.205.0
- Labels: bucketing, buckets, insert into
Description
If table my_table is bucketed, the command "insert into table my_table ..." is supposed to fail with the error "Bucketized tables do not support INSERT INTO".
However, the error is not raised in all cases.
Consider the following example on Hive 0.9.0:
create table src(x string) clustered by( x ) sorted by ( x ) into 32 buckets;
create table dest(x string) clustered by( x ) sorted by ( x ) into 32 buckets;
Now, load some data into src (after setting hive.enforce.bucketing and hive.enforce.sorting to true).
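The load step might look like the following minimal sketch. The two properties are the ones named in the step above; `staging` is a hypothetical, unbucketed table assumed to already hold some string values, and any other way of populating src would do.

```sql
-- Make Hive honor the declared bucketing and sorting when writing to bucketed tables.
set hive.enforce.bucketing=true;
set hive.enforce.sorting=true;

-- `staging` is a hypothetical unbucketed source table, used only for illustration.
insert overwrite table src select x from staging;
```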
Then, do:
insert into table dest select * from src;
This should fail since dest is a bucketized table. However, it succeeds and creates a 33rd file inside the HDFS folder for the table, thereby corrupting the table.
This happens regardless of whether the src table is bucketed or not.
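One way to check the resulting file count from the Hive CLI is sketched below. The directory path assumes the default warehouse location (`/user/hive/warehouse`), which may differ per installation; the actual path appears in the Location field printed by `describe formatted`.

```sql
-- Print table metadata, including Location and the declared Num Buckets.
describe formatted dest;

-- List the files in the table directory (path is an assumption, see above).
-- More files than declared buckets would match the behavior reported here.
dfs -ls /user/hive/warehouse/dest;
```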
Issue Links
- relates to HIVE-3244 Add table property which constraints sorting/bucketing for data loading (Patch Available)