  1. Hive
  2. HIVE-1940

Query Optimization Using Column Statistics and Histograms

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • None
    • None
    • None
    • Component/s: MetaStore

    Description

      Cost-based query optimization in Hive currently relies on statistics gathered for tables and partitions. To enable further improvements, the next step is to design and implement a way to gather statistics on individual columns, as discussed in HIVE-33. Building on that, histograms could be implemented as one option for collecting and using run-time statistics. Alongside the implementation of these features, a consistent storage model for these statistics in the MetaStore also needs to be developed.
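      To make the idea concrete, the following is a minimal sketch of how column statistics with an equi-depth histogram could feed a selectivity estimate for a predicate such as `col <= v`. It is not Hive's implementation or MetaStore schema: the `ColumnStats` class, its fields, and both function names are hypothetical and exist only for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ColumnStats:
    """Hypothetical per-column summary (not Hive's actual MetaStore schema)."""
    num_rows: int
    min_value: float
    max_value: float
    num_distinct: int
    bounds: List[float]  # upper bound of each equi-depth histogram bucket


def build_column_stats(values: Sequence[float], num_buckets: int) -> ColumnStats:
    """Summarize a column; each bucket covers roughly the same number of rows."""
    if not values or num_buckets < 1:
        raise ValueError("need at least one value and one bucket")
    ordered = sorted(values)
    n = len(ordered)
    # The upper bound of bucket i is the value at the (i + 1) / num_buckets quantile.
    bounds = [
        ordered[max(0, (i + 1) * n // num_buckets - 1)]
        for i in range(num_buckets)
    ]
    return ColumnStats(
        num_rows=n,
        min_value=ordered[0],
        max_value=ordered[-1],
        num_distinct=len(set(ordered)),
        bounds=bounds,
    )


def estimate_le_selectivity(stats: ColumnStats, v: float) -> float:
    """Estimate the fraction of rows satisfying `col <= v`."""
    if v < stats.min_value:
        return 0.0
    if v >= stats.max_value:
        return 1.0
    # Buckets whose upper bound is <= v are counted in full.
    full = bisect_right(stats.bounds, v)
    # Interpolate linearly inside the bucket that contains v.
    lower = stats.min_value if full == 0 else stats.bounds[full - 1]
    upper = stats.bounds[full]
    partial = (v - lower) / (upper - lower)
    return (full + partial) / len(stats.bounds)


stats = build_column_stats(list(range(1, 101)), num_buckets=4)
print(stats.bounds)                        # [25, 50, 75, 100]
print(estimate_le_selectivity(stats, 50))  # 0.5
```

      An optimizer could multiply such an estimate by `num_rows` to approximate the output cardinality of a filter. On skewed data the bucket bounds adapt to the distribution, which is the main advantage over assuming uniformity between the minimum and maximum values.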

      Attachments

        1. HiveMetaStore.pdf
          221 kB
          Anja Gruenheid
        2. Agruenheid_ideas11.pdf
          253 kB
          Carl Steinbach

            People

              Assignee: Unassigned
              Reporter: Anja Gruenheid (agruenheid)
              Votes: 0
              Watchers: 12

              Dates

                Created:
                Updated:
                Resolved: