HIVE-5483: Use metastore statistics to optimize max/min/etc. queries

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.13.0
    • Component/s: None
    • Labels: None

    Description

      We have discussed this briefly.
      Hive can answer queries such as select max(c1) from t purely from the metastore, using partition statistics, provided that we know the statistics are up to date.
      All data changes (e.g. adding new partitions) currently go through the metastore, so we can track whether the statistics are up to date. If they are not, queries will have to read data (at least for the outdated partitions) until someone runs analyze table. We could also analyze new partitions right after they are added, if that is configured or specified in the command.
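
      The idea above can be illustrated with a small sketch. This is not Hive's implementation or the attached patch; it is a toy model in Python in which Partition, analyze and max_c1 are hypothetical names, each partition carries a recorded max(c1) plus an up-to-date flag, and only stale partitions are scanned.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Partition:
    """Hypothetical stand-in for a partition plus its metastore statistics."""
    name: str
    rows: List[int]             # values of column c1 stored in this partition
    stats_max: Optional[int]    # max(c1) recorded in the metastore, if any
    stats_up_to_date: bool      # False once data changed after the last analyze


def analyze(partition: Partition) -> None:
    """Recompute the statistic from the data (the role 'analyze table' plays)."""
    partition.stats_max = max(partition.rows) if partition.rows else None
    partition.stats_up_to_date = True


def max_c1(partitions: List[Partition]) -> Tuple[Optional[int], List[str]]:
    """Return max(c1) and the names of the partitions that had to be scanned."""
    candidates: List[int] = []
    scanned: List[str] = []
    for p in partitions:
        if p.stats_up_to_date:
            # Trusted statistics: answer from metadata, no data read.
            if p.stats_max is not None:
                candidates.append(p.stats_max)
        else:
            # Outdated statistics: fall back to reading this partition only.
            scanned.append(p.name)
            if p.rows:
                candidates.append(max(p.rows))
    return (max(candidates) if candidates else None, scanned)
```

      In this model a query over one analyzed and one freshly added partition scans only the new partition; once analyze has been run on it, the same query is answered without scanning any data.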

      Attachments

        1. HIVE-5483.patch
          105 kB
          Ashutosh Chauhan
        2. HIVE-5483.2.patch
          165 kB
          Ashutosh Chauhan
        3. HIVE-5483.3.patch
          173 kB
          Ashutosh Chauhan

            People

              Assignee: Ashutosh Chauhan (ashutoshc)
              Reporter: Sergey Shelukhin (sershe)
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated:
                Resolved: