Hive / HIVE-6511

Casting from decimal to tinyint, smallint, int, and bigint generates different results when vectorization is on

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.13.0
    • Fix Version/s: 0.13.0, 0.14.0
    • Component/s: None
    • Labels: None

    Description

      select dc, cast(dc as int), cast(dc as smallint), cast(dc as tinyint) from vectortab10korc limit 20 generates the following result when vectorization is enabled:

      4619756289662.078125	-1628520834	-16770	126
      1553532646710.316406	-1245514442	-2762	54
      3367942487288.360352	688127224	-776	-8
      4386447830839.337891	1286221623	12087	55
      -3234165331139.458008	-54957251	27453	61
      -488378613475.326172	1247658269	-16099	29
      -493942492598.691406	-21253559	-19895	73
      3101852523586.039062	886135874	23618	66
      2544105595941.381836	1484956709	-23515	37
      -3997512403067.0625	1102149509	30597	-123
      -1183754978977.589355	1655994718	31070	94
      1408783849655.676758	34576568	-26440	-72
      -2993175106993.426758	417098319	27215	79
      3004723551798.100586	-1753555402	-8650	54
      1103792083527.786133	-14511544	-28088	72
      469767055288.485352	1615620024	26552	-72
      -1263700791098.294434	-980406074	12486	-58
      -4244889766496.484375	-1462078048	30112	-96
      -3962729491139.782715	1525323068	-27332	60
      NULL	NULL	NULL	NULL
      

      When vectorization is disabled, the result looks like this:

      4619756289662.078125	-1628520834	-16770	126
      1553532646710.316406	-1245514442	-2762	54
      3367942487288.360352	688127224	-776	-8
      4386447830839.337891	1286221623	12087	55
      -3234165331139.458008	-54957251	27453	61
      -488378613475.326172	1247658269	-16099	29
      -493942492598.691406	-21253558	-19894	74
      3101852523586.039062	886135874	23618	66
      2544105595941.381836	1484956709	-23515	37
      -3997512403067.0625	1102149509	30597	-123
      -1183754978977.589355	1655994719	31071	95
      1408783849655.676758	34576567	-26441	-73
      -2993175106993.426758	417098319	27215	79
      3004723551798.100586	-1753555402	-8650	54
      1103792083527.786133	-14511545	-28089	71
      469767055288.485352	1615620024	26552	-72
      -1263700791098.294434	-980406074	12486	-58
      -4244889766496.484375	-1462078048	30112	-96
      -3962729491139.782715	1525323069	-27331	61
      NULL	NULL	NULL	NULL
      

      This issue is visible only for certain decimal values. In the above example, rows 7, 11, 12, 15, and 19 generate different results.
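      Comparing the two outputs suggests a pattern: the rows that differ are the ones whose fractional part is above .5, and each differing value is off by exactly one. That is consistent with one path truncating toward zero and the other rounding to the nearest integer before narrowing, although the report itself does not state the cause. The Java sketch below checks this reading against three rows of the output. DecimalCastSketch and its methods are hypothetical names for illustration, not Hive code, and HALF_UP is an assumed rounding mode.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper for illustration only; this is not Hive code.
final class DecimalCastSketch {

    // Drop the fractional part (round toward zero), then keep the low-order 32 bits.
    static int truncatingInt(String decimal) {
        return new BigDecimal(decimal)
                .setScale(0, RoundingMode.DOWN)
                .toBigInteger()
                .intValue();
    }

    // Round to the nearest integer first (HALF_UP is an assumption),
    // then keep the low-order 32 bits.
    static int roundingInt(String decimal) {
        return new BigDecimal(decimal)
                .setScale(0, RoundingMode.HALF_UP)
                .toBigInteger()
                .intValue();
    }

    public static void main(String[] args) {
        String[] rows = {
            "4619756289662.078125",  // row 1: fraction below .5
            "-493942492598.691406",  // row 7: fraction above .5
            "1408783849655.676758"   // row 12: fraction above .5
        };
        for (String dc : rows) {
            int t = truncatingInt(dc);
            int r = roundingInt(dc);
            // The (short) and (byte) casts keep the low-order 16 and 8 bits.
            System.out.printf("%s  truncate: %d %d %d  round: %d %d %d%n",
                    dc, t, (short) t, (byte) t, r, (short) r, (byte) r);
        }
    }
}
```

      For these rows, the truncating variant reproduces the values listed with vectorization disabled and the rounding variant reproduces the values listed with vectorization enabled. Row 1 comes out the same under both because its fractional part is below .5.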

      vectortab10korc table schema:

      t                   	tinyint             	from deserializer   
      si                  	smallint            	from deserializer   
      i                   	int                 	from deserializer   
      b                   	bigint              	from deserializer   
      f                   	float               	from deserializer   
      d                   	double              	from deserializer   
      dc                  	decimal(38,18)      	from deserializer   
      bo                  	boolean             	from deserializer   
      s                   	string              	from deserializer   
      s2                  	string              	from deserializer   
      ts                  	timestamp           	from deserializer   
      	 	 
      # Detailed Table Information	 	 
      Database:           	default             	 
      Owner:              	xyz              	 
      CreateTime:         	Tue Feb 25 21:54:28 UTC 2014	 
      LastAccessTime:     	UNKNOWN             	 
      Protect Mode:       	None                	 
      Retention:          	0                   	 
      Location:           	hdfs://host1.domain.com:8020/apps/hive/warehouse/vectortab10korc	 
      Table Type:         	MANAGED_TABLE       	 
      Table Parameters:	 	 
      	COLUMN_STATS_ACCURATE	true                
      	numFiles            	1                   
      	numRows             	10000               
      	rawDataSize         	0                   
      	totalSize           	344748              
      	transient_lastDdlTime	1393365281          
      	 	 
      # Storage Information	 	 
      SerDe Library:      	org.apache.hadoop.hive.ql.io.orc.OrcSerde	 
      InputFormat:        	org.apache.hadoop.hive.ql.io.orc.OrcInputFormat	 
      OutputFormat:       	org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat	 
      Compressed:         	No                  	 
      Num Buckets:        	-1                  	 
      Bucket Columns:     	[]                  	 
      Sort Columns:       	[]                  	 
      Storage Desc Params:	 	 
      	serialization.format	1                   
      Time taken: 0.196 seconds, Fetched: 41 row(s)
      

      Attachments

        1. HIVE-6511.1.patch
          3 kB
          Jitendra Nath Pandey
        2. HIVE-6511.2.patch
          6 kB
          Jitendra Nath Pandey
        3. HIVE-6511.3.patch
          6 kB
          Jitendra Nath Pandey
        4. HIVE-6511.4.patch
          2 kB
          Jitendra Nath Pandey

            People

              Assignee: Jitendra Nath Pandey
              Reporter: Jitendra Nath Pandey