Hive / HIVE-6968

list bucketing feature does not update the location map for unpartitioned tables

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.11.0, 0.12.0, 0.13.0, 0.14.0
    • Fix Version/s: 0.14.0
    • Component/s: None
    • Labels: None

      Description

      The list bucketing feature maintains a map in the metastore from skewed column values to their storage locations. This map is not updated for unpartitioned tables, whereas for partitioned tables it is updated properly. To reproduce the issue:

      hive>set hive.mapred.supports.subdirectories=true;
      hive>set mapred.input.dir.recursive=true;
      
      hive>create table t(col1 string, col2 string);
      hive>load data local inpath '/home/hadoop/a.txt' into table t;
      hive>select * from t;
      OK
      1	a
      2	b
      3	c
      4	a
      5	b
      6	a
      
      hive>create table t1(r1 string, r2 string) skewed by (r2) on ('a') stored as directories;
      hive>insert into table t1 select * from t;
      hive>desc extended t1;
      OK
      r1                  	string              	                    
      r2                  	string              	                    
      	 	 
      Detailed Table Information	Table(tableName:t1, dbName:default, owner:pjayachandran, createTime:1398295903, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:r1, type:string, comment:null), FieldSchema(name:r2, type:string, comment:null)], location:file:/app/warehouse/t1, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[r2], skewedColValues:[[a]], skewedColValueLocationMaps:{}), storedAsSubDirectories:true), partitionKeys:[], parameters:{numFiles=6, COLUMN_STATS_ACCURATE=true, transient_lastDdlTime=1398297887, numRows=6, totalSize=72, rawDataSize=18}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)	
      Time taken: 0.119 seconds, Fetched: 4 row(s)
      

      As the describe output shows, skewedColValueLocationMaps is empty even though t1 is skewed on r2 with value 'a' and stored as directories.
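Checking the describe output by eye is error-prone because the table information is printed on a single long line. The following is a minimal sketch in Python of how that check could be automated, assuming the output of `desc extended` has already been captured as a string. The function names `skewed_location_map` and `location_map_is_empty` are hypothetical helpers, not part of Hive, and the sketch assumes the map body contains no nested braces.

```python
import re


def skewed_location_map(describe_text):
    """Return the raw text between the braces of skewedColValueLocationMaps,
    or None if the field does not appear in the describe output."""
    # Assumption: the map is printed as skewedColValueLocationMaps:{...}
    # with no nested braces inside it.
    match = re.search(r"skewedColValueLocationMaps:\{(.*?)\}", describe_text)
    if match is None:
        return None
    return match.group(1).strip()


def location_map_is_empty(describe_text):
    """Return True when the skewed value location map is present but empty."""
    body = skewed_location_map(describe_text)
    if body is None:
        raise ValueError("no skewedColValueLocationMaps field found")
    return body == ""


# Fragment copied from the describe output in this report.
sample = (
    "skewedInfo:SkewedInfo(skewedColNames:[r2], skewedColValues:[[a]], "
    "skewedColValueLocationMaps:{}), storedAsSubDirectories:true)"
)
print(location_map_is_empty(sample))
```

For the fragment taken from this report the check reports an empty map, which is the symptom described here. On a build that includes the fix, the same check on an unpartitioned skewed table would be expected to report a non-empty map.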

      1. HIVE-6968.1.patch
        24 kB
        Prasanth Jayachandran
      2. HIVE-6968.2.patch
        24 kB
        Prasanth Jayachandran

          Activity

          Thejas M Nair made changes -
          Status: Resolved [ 5 ] -> Closed [ 6 ]
          Prasanth Jayachandran made changes -
          Fix Version/s: 0.14.0 [ 12326450 ]
          Prasanth Jayachandran made changes -
          Status: Patch Available [ 10002 ] -> Resolved [ 5 ]
          Resolution: Fixed [ 1 ]
          Prasanth Jayachandran made changes -
          Attachment: HIVE-6968.2.patch [ 12642379 ]
          Prasanth Jayachandran made changes -
          Remote Link: This issue links to "Review Board (Web Link)" [ 14922 ]
          Prasanth Jayachandran made changes -
          Status: Open [ 1 ] -> Patch Available [ 10002 ]
          Prasanth Jayachandran made changes -
          Attachment: HIVE-6968.1.patch [ 12641642 ]
          Prasanth Jayachandran created issue -

            People

            • Assignee: Prasanth Jayachandran
            • Reporter: Prasanth Jayachandran
            • Votes: 0
            • Watchers: 3
