Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 0.11.0, 0.12.0, 0.13.0, 0.14.0
- None
- None
Description
The list bucketing feature maintains a map from skewed column values to storage locations in the metastore. This map is not updated for unpartitioned tables; for partitioned tables the location map is updated properly. To reproduce the issue:
hive> set hive.mapred.supports.subdirectories=true;
hive> set mapred.input.dir.recursive=true;
hive> create table t(col1 string, col2 string);
hive> load data local inpath '/home/hadoop/a.txt' into table t;
hive> select * from t;
OK
1 a
2 b
3 c
4 a
5 b
6 a
hive> create table t1(r1 string, r2 string) skewed by (r2) on ('a') stored as directories;
hive> insert into table t1 select * from t;
hive> desc extended t1;
OK
r1 string
r2 string
Detailed Table Information Table(tableName:t1, dbName:default, owner:pjayachandran, createTime:1398295903, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:r1, type:string, comment:null), FieldSchema(name:r2, type:string, comment:null)], location:file:/app/warehouse/t1, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[r2], skewedColValues:[[a]], skewedColValueLocationMaps:{}), storedAsSubDirectories:true), partitionKeys:[], parameters:{numFiles=6, COLUMN_STATS_ACCURATE=true, transient_lastDdlTime=1398297887, numRows=6, totalSize=72, rawDataSize=18}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)
Time taken: 0.119 seconds, Fetched: 4 row(s)
As the describe output shows, skewedColValueLocationMaps is empty, even though the inserted data contains rows with the skewed value 'a'.
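When re-running the reproduction, the empty map can be checked mechanically instead of by eye. The following is only a rough sketch: it assumes the text printed by desc extended has been captured into a string, and the helper name location_map_is_empty is hypothetical, not part of Hive. It looks only at the skewedColValueLocationMaps:{...} fragment and does not try to parse the map's entries.

```python
import re


def location_map_is_empty(describe_output: str) -> bool:
    """Return True if skewedColValueLocationMaps is printed as an empty map.

    Hypothetical helper: it inspects only the text between the braces that
    follow 'skewedColValueLocationMaps:' in `desc extended` output.
    """
    match = re.search(r"skewedColValueLocationMaps:\{(.*?)\}", describe_output)
    if match is None:
        raise ValueError("skewedColValueLocationMaps not found in output")
    return match.group(1).strip() == ""


# Fragment copied from the describe output in this report.
sample = (
    "skewedInfo:SkewedInfo(skewedColNames:[r2], skewedColValues:[[a]], "
    "skewedColValueLocationMaps:{}), storedAsSubDirectories:true)"
)
print(location_map_is_empty(sample))  # prints True, i.e. the bug is present
```

A result of True for a skewed table that already holds rows with a skewed value corresponds to the symptom described here.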