HIVE-23871

ObjectStore should properly handle MicroManaged Table properties

Details

    Description

      HIVE-23281 optimizes StorageDescriptor conversion in the ObjectStore by skipping certain Table properties such as SkewInfo, bucketCols, and ordering.
      However, it applies this shortcut to all transactional tables, not only full ACID ones, which causes MicroManaged tables to misbehave.
      MicroManaged (insert_only) tables can end up missing properties they need, such as the Storage Desc Params that define how each line is split into fields, as the repro steps below show.
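
      The distinction described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the ObjectStore code: the helper names are hypothetical, and the rule "only full ACID tables may skip the extra StorageDescriptor details" is inferred from the description. The parameter keys `transactional` and `transactional_properties` are the ones shown in the DESCRIBE FORMATTED output of this issue.

```python
# Hypothetical helpers for illustration only; these are not Hive APIs.

def is_transactional(params):
    """True when the table parameters mark the table as transactional."""
    return params.get("transactional", "false").lower() == "true"


def is_insert_only(params):
    """True for MicroManaged tables (transactional_properties = insert_only)."""
    return (
        is_transactional(params)
        and params.get("transactional_properties", "").lower() == "insert_only"
    )


def is_full_acid(params):
    """True for transactional tables that are not insert-only."""
    return is_transactional(params) and not is_insert_only(params)


def can_skip_sd_details(params):
    # Assumed rule: the shortcut is only safe for full ACID tables.
    # Keying it on is_transactional() alone would also catch insert_only tables.
    return is_full_acid(params)


mm_params = {"transactional": "true", "transactional_properties": "insert_only"}
acid_params = {"transactional": "true", "transactional_properties": "default"}

print(can_skip_sd_details(acid_params))  # full ACID table
print(can_skip_sd_details(mm_params))    # MicroManaged table
```

      Under this assumed rule, a check based only on the `transactional` flag would treat both tables the same way, which matches the behaviour reported here.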

      To reproduce the issue:

      CREATE TRANSACTIONAL TABLE delim_table_trans(id INT, name STRING, safety INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;
      LOAD DATA INPATH 'table1' OVERWRITE INTO TABLE delim_table_trans;
      DESCRIBE FORMATTED delim_table_trans;
      SELECT * FROM delim_table_trans;
      

      Result (the Storage Information section lists no Storage Desc Params, and every column is read as NULL):

      Table Type:         	MANAGED_TABLE       	 
      Table Parameters:	 	 
      	bucketing_version   	2                   
      	numFiles            	1                   
      	numRows             	0                   
      	rawDataSize         	0                   
      	totalSize           	72                  
      	transactional       	true                
      	transactional_properties	insert_only         
      #### A masked pattern was here ####
      	 	 
      # Storage Information	 	 
      SerDe Library:      	org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe	 
      InputFormat:        	org.apache.hadoop.mapred.TextInputFormat	 
      OutputFormat:       	org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat	 
      Compressed:         	No                  	 
      Num Buckets:        	-1                  	 
      Bucket Columns:     	[]                  	 
      Sort Columns:       	[]                  	 
      PREHOOK: query: SELECT * FROM delim_table_trans
      PREHOOK: type: QUERY
      PREHOOK: Input: default@delim_table_trans
      #### A masked pattern was here ####
      POSTHOOK: query: SELECT * FROM delim_table_trans
      POSTHOOK: type: QUERY
      POSTHOOK: Input: default@delim_table_trans
      #### A masked pattern was here ####
      NULL	NULL	NULL
      NULL	NULL	NULL
      NULL	NULL	NULL
      NULL	NULL	NULL
      NULL	NULL	NULL
      NULL	NULL	NULL
       
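
      The all-NULL rows are consistent with the field delimiter being lost. A minimal sketch of that effect, assuming the reader falls back to LazySimpleSerDe's default field delimiter (Ctrl-A, `\x01`) when the Storage Desc Params are missing. The sample line is made up, since the content of the attached table1 file is not shown here, and `read_row` is a hypothetical stand-in for the SerDe, not Hive code.

```python
DEFAULT_FIELD_DELIM = "\x01"  # Ctrl-A, the default field delimiter of LazySimpleSerDe


def parse_int(text):
    """Return an int, or None (Hive's NULL) when the text is missing or not a number."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return None


def read_row(line, field_delim):
    """Split a text line into the (id INT, name STRING, safety INT) columns."""
    fields = line.split(field_delim)[:3]
    # Columns with no corresponding field are treated as NULL.
    fields += [None] * (3 - len(fields))
    return (parse_int(fields[0]), fields[1], parse_int(fields[2]))


sample_line = "1\tAcura\t4"  # made-up row; not the real content of table1

with_tab = read_row(sample_line, "\t")
with_default = read_row(sample_line, DEFAULT_FIELD_DELIM)

print(with_tab)      # delimiter preserved
print(with_default)  # delimiter lost, whole line lands in the INT column
```

      With the tab delimiter the three columns parse normally. With the default delimiter the whole line becomes the first field, fails to parse as INT, and the remaining columns have no data, so all three come back empty.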

      Attachments

        1. table1 (0.1 kB), Panagiotis Garefalakis

            People

              pgaref Panagiotis Garefalakis

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h 20m