Hive / HIVE-3777

add a property in the partition to figure out if stats are accurate

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.13.0
    • Fix Version/s: 0.13.0
    • Component/s: Query Processor
    • Labels: None

    Description

      Currently, the stats task tries to update the statistics of the table/partition
      after the table/partition is loaded. If updating these stats fails (for any
      reason), the outcome depends on hive.stats.reliable: if it is set to true, the
      operation fails; otherwise the operation succeeds but may write inaccurate
      stats. This can be bad for applications that do not always care about reliable
      stats, since the query may have taken a long time to execute only to fail at
      the end.

      Another property should be added to the partition: areStatsAccurate. If
      hive.stats.reliable is set to false and the stats could not be computed
      correctly, the operation would still succeed and update the stats, but set
      areStatsAccurate to false. If the application cares about accurate stats, they
      can be obtained in the background.
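
      The proposed behavior can be sketched roughly as follows. This is only an
      illustration of the decision logic described above, not code from any of the
      attached patches: the class StatsAccuracySketch, the method finishLoad, the
      exception type, and the use of a plain string map with a "numRows" entry to
      stand in for partition parameters are all assumptions made for the example.
      Only hive.stats.reliable and areStatsAccurate come from the description.

```java
import java.util.HashMap;
import java.util.Map;

public class StatsAccuracySketch {

    /** Hypothetical exception: the load fails because reliable stats were required. */
    static class StatsUpdateException extends RuntimeException {
        StatsUpdateException(String message) {
            super(message);
        }
    }

    /**
     * Hypothetical stand-in for the last step of a load.
     *
     * @param statsReliable          value of hive.stats.reliable
     * @param statsComputedCorrectly whether the stats update worked
     * @param rowCount               the (possibly inaccurate) row count gathered
     * @return illustrative partition parameters, including areStatsAccurate
     */
    static Map<String, String> finishLoad(boolean statsReliable,
                                          boolean statsComputedCorrectly,
                                          long rowCount) {
        if (!statsComputedCorrectly && statsReliable) {
            // Reliable stats were demanded, so the whole operation fails.
            throw new StatsUpdateException(
                "stats could not be computed and hive.stats.reliable=true");
        }
        // Otherwise the operation succeeds and the stats are still written,
        // but the partition records whether they can be trusted.
        Map<String, String> partitionParams = new HashMap<>();
        partitionParams.put("numRows", Long.toString(rowCount));
        partitionParams.put("areStatsAccurate",
                            Boolean.toString(statsComputedCorrectly));
        return partitionParams;
    }

    public static void main(String[] args) {
        // Stats failed, hive.stats.reliable=false: load succeeds, flag is false.
        Map<String, String> params = finishLoad(false, false, 100L);
        System.out.println("areStatsAccurate=" + params.get("areStatsAccurate"));
    }
}
```

      Under these assumptions, an application that needs accurate stats could look
      for partitions whose areStatsAccurate is false and recompute their stats in
      the background, instead of failing the original long-running query.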

      Attachments

        1. HIVE-3777.5.patch (552 kB, Ashutosh Chauhan)
        2. HIVE-3777.4.patch (552 kB, Ashutosh Chauhan)
        3. HIVE-3777.3.patch (552 kB, Ashutosh Chauhan)
        4. HIVE-3777.2.patch (541 kB, Ashutosh Chauhan)
        5. HIVE-3777.2.patch (22 kB, Ashutosh Chauhan)
        6. HIVE-3777.patch (13 kB, Ashutosh Chauhan)

            People

              Assignee: Ashutosh Chauhan
              Reporter: Namit Jain
              Votes: 0
              Watchers: 8

              Dates

                Created:
                Updated:
                Resolved: