Hive / HIVE-12309

TableScan should use column stats when available for better data size estimate

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.14.0, 1.0.0, 1.1.0, 1.2.0
    • Fix Version/s: 2.0.0
    • Component/s: Statistics
    • Labels: None

    Description

      Currently, all other operators use column stats to estimate data size, whereas TableScan relies on rawDataSize. Because of this inconsistency, a TableScan (TS) may report a smaller data size than the operators that follow it.
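
      To illustrate the idea, the following is a minimal sketch of a column-stats-based size estimate with a fallback to rawDataSize. It is not the code from the attached patches: the class `ScanSizeEstimate`, the `ColumnStat` type, and both method names are hypothetical, and the formula (row count times the sum of average column widths) is an assumed simplification of how such an estimate might be computed.

```java
import java.util.List;

// Hypothetical illustration only; these names are not Hive's actual classes.
final class ScanSizeEstimate {

    /** Per-column statistic: average width of one value, in bytes. */
    static final class ColumnStat {
        final String name;
        final double avgColLen;

        ColumnStat(String name, double avgColLen) {
            this.name = name;
            this.avgColLen = avgColLen;
        }
    }

    /** Estimate from column stats: row count times the summed average column widths. */
    static long fromColumnStats(long numRows, List<ColumnStat> stats) {
        double rowWidth = 0.0;
        for (ColumnStat s : stats) {
            rowWidth += s.avgColLen;
        }
        return Math.round(numRows * rowWidth);
    }

    /** Prefer the column-stats estimate; fall back to rawDataSize when no stats exist. */
    static long estimate(long numRows, long rawDataSize, List<ColumnStat> stats) {
        if (stats == null || stats.isEmpty()) {
            return rawDataSize;
        }
        return fromColumnStats(numRows, stats);
    }
}
```

      With 1,000 rows and average column widths of 4, 8, and 12.5 bytes, `estimate` returns 24,500 regardless of the rawDataSize passed in. When the stats list is empty, it returns rawDataSize unchanged. Using the same basis for TableScan as for the other operators is what keeps the estimates comparable along the operator tree.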

      Attachments

        1. HIVE-12309.2.patch
          246 kB
          Ashutosh Chauhan
        2. HIVE-12309.patch
          112 kB
          Ashutosh Chauhan

            People

              Assignee: Ashutosh Chauhan (ashutoshc)
              Reporter: Ashutosh Chauhan (ashutoshc)
              Votes: 0
              Watchers: 1
