HIVE-7736

Improve the column stats update speed for all the partitions of a table

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: None
    • Labels: None

    Description

      The current implementation of the column statistics update for all the partitions of a table takes a long time when the table has thousands of partitions.

      For example, on a given cluster, it took more than 600 seconds (over 0.3 seconds per partition) to update the column statistics of every partition of a table with 2 columns and 2000 partitions:

      ANALYZE TABLE src_stat_part PARTITION (partitionId) COMPUTE STATISTICS FOR COLUMNS;

      We would like to improve the speed of the column statistics update for all the partitions of a table.

      Attachments

        1. HIVE-7736.0.patch
          757 kB
          Pengcheng Xiong
        2. HIVE-7736.1.patch
          757 kB
          Pengcheng Xiong
        3. HIVE-7736.2.patch
          833 kB
          Pengcheng Xiong
        4. HIVE-7736.3.patch
          833 kB
          Pengcheng Xiong
        5. HIVE-7736.4.patch
          833 kB
          Pengcheng Xiong

            People

              Assignee: Pengcheng Xiong (pxiong)
              Reporter: Pengcheng Xiong (pxiong)