Hive
  1. Hive
  2. HIVE-3006

Skip execution of queries with always false WHERE clauses

    Details

    • Type: Improvement Improvement
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels:
      None

      Activity

      Hide
      Carl Steinbach added a comment -

      It's not a partitioned table:

      hive> DESCRIBE FORMATTED src;
      DESCRIBE FORMATTED src;
      OK
      # col_name            	data_type           	comment             
      	 	 
      key                 	string              	None                
      value               	string              	None                
      	 	 
      # Detailed Table Information	 	 
      Database:           	default             	 
      Owner:              	carl                	 
      CreateTime:         	Sat May 05 18:19:09 PDT 2012	 
      LastAccessTime:     	UNKNOWN             	 
      Protect Mode:       	None                	 
      Retention:          	0                   	 
      Location:           	file:/user/hive/warehouse/src	 
      Table Type:         	MANAGED_TABLE       	 
      Table Parameters:	 	 
      	transient_lastDdlTime	1336267149          
      	 	 
      # Storage Information	 	 
      SerDe Library:      	org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe	 
      InputFormat:        	org.apache.hadoop.mapred.TextInputFormat	 
      OutputFormat:       	org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat	 
      Compressed:         	No                  	 
      Num Buckets:        	-1                  	 
      Bucket Columns:     	[]                  	 
      Sort Columns:       	[]                  	 
      Storage Desc Params:	 	 
      	serialization.format	1                   
      Time taken: 0.346 seconds
      
      Show
      Carl Steinbach added a comment - It's not a partitioned table: hive> DESCRIBE FORMATTED src; DESCRIBE FORMATTED src; OK # col_name data_type comment key string None value string None # Detailed Table Information Database: default Owner: carl CreateTime: Sat May 05 18:19:09 PDT 2012 LastAccessTime: UNKNOWN Protect Mode: None Retention: 0 Location: file:/user/hive/warehouse/src Table Type: MANAGED_TABLE Table Parameters: transient_lastDdlTime 1336267149 # Storage Information SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe InputFormat: org.apache.hadoop.mapred.TextInputFormat OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Compressed: No Num Buckets: -1 Bucket Columns: [] Sort Columns: [] Storage Desc Params: serialization.format 1 Time taken: 0.346 seconds
      Hide
      Namit Jain added a comment -

      Is src a partitioned table ? In that case, I think, all the partitions would be pruned, thereby getting the right result.

      But, I agree, this should also work for non-partitioned tables.

      Show
      Namit Jain added a comment - Is src a partitioned table ? In that case, I think, all the partitions would be pruned, thereby getting the right result. But, I agree, this should also work for non-partitioned tables.
      Hide
      Carl Steinbach added a comment -

      Some clients (e.g. ODBC clients) execute queries with constant WHERE clauses that are always false in order to generate a result set without incurring the expense of actually executing the query. Unfortunately, this technique does not work with Hive:

      hive> SELECT key FROM src WHERE false;
      SELECT key FROM src WHERE false;
      Total MapReduce jobs = 1
      Launching Job 1 out of 1
      Number of reduce tasks is set to 0 since there's no reduce operator
      Execution log at: /tmp/carl/carl_20120505183939_4008b766-5637-4c5d-8cb0-22fb222ee228.log
      Job running in-process (local Hadoop)
      Hadoop job information for null: number of mappers: 0; number of reducers: 0
      2012-05-05 18:39:15,923 null map = 100%,  reduce = 0%
      Ended Job = job_local_0001
      Execution completed successfully
      Mapred Local Task Succeeded . Convert the Join into MapJoin
      OK
      Time taken: 3.664 seconds
      
      hive> SELECT key FROM src WHERE 1=0;
      SELECT key FROM src WHERE 1=0;
      Total MapReduce jobs = 1
      Launching Job 1 out of 1
      Number of reduce tasks is set to 0 since there's no reduce operator
      Execution log at: /tmp/carl/carl_20120505184040_074bd38d-2697-40e2-996a-91a46aaad71b.log
      Job running in-process (local Hadoop)
      Hadoop job information for null: number of mappers: 0; number of reducers: 0
      2012-05-05 18:40:11,407 null map = 100%,  reduce = 0%
      Ended Job = job_local_0001
      Execution completed successfully
      Mapred Local Task Succeeded . Convert the Join into MapJoin
      OK
      Time taken: 3.615 seconds
      
      Show
      Carl Steinbach added a comment - Some clients (e.g. ODBC clients) execute queries with constant WHERE clauses that are always false in order to generate a result set without incurring the expense of actually executing the query. Unfortunately, this technique does not work with Hive: hive> SELECT key FROM src WHERE false; SELECT key FROM src WHERE false; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks is set to 0 since there's no reduce operator Execution log at: /tmp/carl/carl_20120505183939_4008b766-5637-4c5d-8cb0-22fb222ee228.log Job running in-process (local Hadoop) Hadoop job information for null: number of mappers: 0; number of reducers: 0 2012-05-05 18:39:15,923 null map = 100%, reduce = 0% Ended Job = job_local_0001 Execution completed successfully Mapred Local Task Succeeded . Convert the Join into MapJoin OK Time taken: 3.664 seconds hive> SELECT key FROM src WHERE 1=0; SELECT key FROM src WHERE 1=0; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks is set to 0 since there's no reduce operator Execution log at: /tmp/carl/carl_20120505184040_074bd38d-2697-40e2-996a-91a46aaad71b.log Job running in-process (local Hadoop) Hadoop job information for null: number of mappers: 0; number of reducers: 0 2012-05-05 18:40:11,407 null map = 100%, reduce = 0% Ended Job = job_local_0001 Execution completed successfully Mapred Local Task Succeeded . Convert the Join into MapJoin OK Time taken: 3.615 seconds

        People

        • Assignee:
          Shreepadma Venugopalan
          Reporter:
          Carl Steinbach
        • Votes:
          0 Vote for this issue
          Watchers:
          2 Start watching this issue

          Dates

          • Created:
            Updated:

            Development