1. Hive
  2. HIVE-5369

Annotate Hive operator tree with statistics from metastore



      Statistics gathered at the table/partition level and the column level are currently not used during the query planning stage, although they could be used to optimize query plans. Basic statistics such as uncompressed data size can be used for better reducer estimation. Other statistics such as number of rows, number of distinct values per column, and average column length can be used by the Cost Based Optimizer (CBO) to select better query plans.

      As a first step in improving query planning, the statistics available in the metastore should be attached to the Hive operator tree: the operator tree should be walked and each operator annotated with statistics. The attached statistics will vary for each operator depending on the operation it performs. For example, a select operator changes the average row size but does not affect the number of rows. Similarly, a filter operator changes the number of rows but does not change the average row size. Similar rules can be applied to other operators.
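
      The walk-and-annotate idea can be illustrated with a minimal sketch. This is not the patch's code: the class names (Statistics, Operator, SelectOp, FilterOp), the annotate function, and the fixed selectivity and row-size parameters are all hypothetical and chosen only to show how statistics might propagate from a table scan down the tree.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Statistics:
    num_rows: int
    avg_row_size: float  # bytes per row

    @property
    def data_size(self) -> float:
        # Estimated uncompressed size of the operator's output.
        return self.num_rows * self.avg_row_size


@dataclass
class Operator:
    # Hypothetical stand-in for an operator node; children are downstream operators.
    name: str
    children: List["Operator"] = field(default_factory=list)
    stats: Optional[Statistics] = None

    def derive(self, incoming: Statistics) -> Statistics:
        # Default rule: pass the incoming statistics through unchanged.
        return Statistics(incoming.num_rows, incoming.avg_row_size)


@dataclass
class SelectOp(Operator):
    output_row_size: float = 0.0  # assumed known from the projected columns

    def derive(self, incoming: Statistics) -> Statistics:
        # Select changes the average row size, not the number of rows.
        return Statistics(incoming.num_rows, self.output_row_size)


@dataclass
class FilterOp(Operator):
    selectivity: float = 1.0  # assumed fraction of rows that pass the predicate

    def derive(self, incoming: Statistics) -> Statistics:
        # Filter changes the number of rows, not the average row size.
        return Statistics(round(incoming.num_rows * self.selectivity),
                          incoming.avg_row_size)


def annotate(op: Operator, incoming: Statistics) -> None:
    # Walk the tree top-down, attaching derived statistics to each operator.
    op.stats = op.derive(incoming)
    for child in op.children:
        annotate(child, op.stats)


# Table scan -> filter -> select, seeded with made-up metastore statistics.
sel = SelectOp("SEL", output_row_size=16.0)
fil = FilterOp("FIL", children=[sel], selectivity=0.25)
scan = Operator("TS", children=[fil])
annotate(scan, Statistics(num_rows=1000, avg_row_size=100.0))

for op in (scan, fil, sel):
    print(op.name, op.stats.num_rows, op.stats.avg_row_size, op.stats.data_size)
```

      In this sketch the filter reduces the row count from 1000 to 250 while keeping the 100-byte row size, and the select keeps 250 rows while shrinking each row to 16 bytes.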

      The rules for the different operators are added as comments in the code. For more detail, the reference book I am using is "Database Systems: The Complete Book" by Garcia-Molina, Ullman, and Widom.
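
      As an illustration of the kind of rule that book describes, the sketch below implements two of its standard size estimates: an equality filter on column a keeps roughly T(R) / V(R, a) rows, and an equi-join on a shared column yields roughly T(R) * T(S) / max(V(R, a), V(S, a)) rows, where T is the row count and V the number of distinct values. The function names are hypothetical, and whether the patch applies exactly these formulas is an assumption, not something stated in this issue.

```python
import math


def equality_filter_rows(num_rows: int, distinct_values: int) -> int:
    # Textbook estimate for "col = constant": T(R) / V(R, col).
    # Guard against a missing or zero distinct-value count.
    if num_rows == 0:
        return 0
    return math.ceil(num_rows / max(1, distinct_values))


def equi_join_rows(left_rows: int, right_rows: int,
                   left_ndv: int, right_ndv: int) -> int:
    # Textbook estimate for R JOIN S on one column:
    # T(R) * T(S) / max(V(R, col), V(S, col)).
    return math.ceil(left_rows * right_rows / max(1, left_ndv, right_ndv))


# 1000 rows with 50 distinct values in the filtered column.
filtered = equality_filter_rows(1000, 50)
# Join of 1000 rows (50 distinct keys) with 200 rows (100 distinct keys).
joined = equi_join_rows(1000, 200, 50, 100)
print(filtered, joined)
```

      Both estimates assume values are uniformly distributed, so skewed columns can make them far off.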

      1. HIVE-5369.1.txt
        750 kB
        Prasanth Jayachandran
      2. HIVE-5369.10.patch
        1.29 MB
        Prasanth Jayachandran
      3. HIVE-5369.2.patch.txt
        725 kB
        Prasanth Jayachandran
      4. HIVE-5369.2.WIP.txt
        874 kB
        Prasanth Jayachandran
      5. HIVE-5369.3.patch.txt
        718 kB
        Prasanth Jayachandran
      6. HIVE-5369.4.patch.txt
        796 kB
        Prasanth Jayachandran
      7. HIVE-5369.5.patch.txt
        800 kB
        Prasanth Jayachandran
      8. HIVE-5369.6.patch.txt
        803 kB
        Prasanth Jayachandran
      9. HIVE-5369.7.patch.txt
        1.23 MB
        Prasanth Jayachandran
      10. HIVE-5369.8.patch.txt
        1.27 MB
        Prasanth Jayachandran
      11. HIVE-5369.9.patch
        1.29 MB
        Gunther Hagleitner
      12. HIVE-5369.9.patch.txt
        1.29 MB
        Prasanth Jayachandran
      13. HIVE-5369.refactor.WIP.txt
        700 kB
        Prasanth Jayachandran
      14. HIVE-5369.WIP.txt
        146 kB
        Prasanth Jayachandran



            • Assignee:
              Prasanth Jayachandran


              • Created: