Hive / HIVE-2124

Specified functions in the partitioning predicates should not generate a M/R job.

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.5.0, 0.6.0, 0.7.0
    • Fix Version/s: None
    • Component/s: Query Processor

    Description

      In certain situations, being able to specify which functions should be evaluated only once would simplify the syntax and avoid launching unnecessary M/R jobs.

      Example:

      # myhql.time=`date "+%s"` -> passed in as a constant
      # counting rows from the last 30 days generates an M/R job using all the partitions
      $ hive -hiveconf myhql.time=`date "+%s"` -e "SELECT COUNT(*) FROM mybigtable WHERE mypartition >= from_unixtime(\${hiveconf:myhql.time}-2592000,'yyyy-MM-dd');"

      Suggested feature:

      # will scan only the right partitions
      $ hive -hiveconf hive.partition.evaluateonce=unix_timestamp -e "SELECT COUNT(*) FROM mybigtable WHERE mypartition >= from_unixtime(unix_timestamp()-2592000,'yyyy-MM-dd');"
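
A possible variant of the workaround in the example above, given only as a sketch: compute the whole cutoff date in the shell, so that the predicate compares the partition column against a plain string literal and contains no function call at all. This assumes GNU `date` (for `-d "30 days ago"`) and that `mypartition` holds `yyyy-MM-dd` strings, as the example implies. The variable names `cutoff` and `query` are illustrative only, and whether partitions are actually pruned should be checked on the Hive version in use.

```shell
# Compute the cutoff date once, in the shell (GNU date syntax; an assumption).
cutoff=$(date -d "30 days ago" "+%Y-%m-%d")

# Build the query with the cutoff as a plain string literal,
# so the partition predicate contains no function call.
query="SELECT COUNT(*) FROM mybigtable WHERE mypartition >= '${cutoff}'"
echo "$query"

# Run the query only where the hive CLI is installed.
if command -v hive >/dev/null 2>&1; then
  hive -e "$query"
fi
```

Unlike the feature suggested here, this keeps the date arithmetic outside HiveQL, so it does not help for queries stored in scripts that must remain self-contained.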


            People

              Assignee: Unassigned
              Reporter: Esteban Gutierrez
              Votes: 3
              Watchers: 5
