  1. Hive
  2. HIVE-6578

Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • 0.13.0
    • 0.13.0
    • None

    Description

      ORC provides file-level statistics that can be used in the analyze partialscan and noscan cases to compute basic statistics such as the number of rows, number of files, total file size, and raw data size. On the writer side, an interface added earlier (StatsProvidingRecordWriter) exposes stats when writing a table. Similarly, a new interface, StatsProvidingRecordReader, can be added; a record reader that implements it should provide the stats gathered by the underlying file format.
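
      The reader-side idea can be illustrated with a minimal, self-contained sketch. It is not the code from the attached patches: only the interface name StatsProvidingRecordReader comes from the description, while the getStats() signature, FileStats, TableStats, FooterBackedReader, and AnalyzeSketch are hypothetical names introduced here. The sketch assumes each reader already holds the row count and raw data size taken from its file footer, and that file sizes are supplied separately (for example by the file system).

```java
import java.util.List;

/** Hypothetical holder for the stats one file reports; not a Hive class. */
final class FileStats {
    final long rowCount;
    final long rawDataSize;

    FileStats(long rowCount, long rawDataSize) {
        this.rowCount = rowCount;
        this.rawDataSize = rawDataSize;
    }
}

/** Interface named in the description; the method signature is an assumption. */
interface StatsProvidingRecordReader {
    FileStats getStats();
}

/** Hypothetical reader standing in for one whose file footer was already parsed. */
final class FooterBackedReader implements StatsProvidingRecordReader {
    private final FileStats footerStats;

    FooterBackedReader(long rowCount, long rawDataSize) {
        this.footerStats = new FileStats(rowCount, rawDataSize);
    }

    @Override
    public FileStats getStats() {
        // Returns the footer values directly; no rows are read.
        return footerStats;
    }
}

/** Hypothetical aggregate of the four basic statistics listed in the description. */
final class TableStats {
    final long numRows;
    final long numFiles;
    final long totalSize;
    final long rawDataSize;

    TableStats(long numRows, long numFiles, long totalSize, long rawDataSize) {
        this.numRows = numRows;
        this.numFiles = numFiles;
        this.totalSize = totalSize;
        this.rawDataSize = rawDataSize;
    }
}

final class AnalyzeSketch {
    /** Sums per-file stats; fileSizes[i] is the on-disk size of the i-th file. */
    static TableStats aggregate(List<? extends StatsProvidingRecordReader> readers,
                                long[] fileSizes) {
        if (readers.size() != fileSizes.length) {
            throw new IllegalArgumentException("one file size is required per reader");
        }
        long rows = 0;
        long raw = 0;
        long total = 0;
        for (int i = 0; i < fileSizes.length; i++) {
            FileStats s = readers.get(i).getStats();
            rows += s.rowCount;
            raw += s.rawDataSize;
            total += fileSizes[i];
        }
        return new TableStats(rows, fileSizes.length, total, raw);
    }
}
```

      Calling AnalyzeSketch.aggregate with one reader per file yields the four basic statistics in a single pass over the readers. Keeping the stats accessor on the reader interface means the aggregation code does not need to know which file format produced the numbers.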

      Attachments

        1. HIVE-6578.4.patch.txt
          486 kB
          Prasanth Jayachandran
        2. HIVE-6578.4.patch
          486 kB
          Prasanth Jayachandran
        3. HIVE-6578.3.patch
          486 kB
          Prasanth Jayachandran
        4. HIVE-6578.2.patch
          328 kB
          Prasanth Jayachandran
        5. HIVE-6578.1.patch
          175 kB
          Prasanth Jayachandran

            People

              prasanth_j Prasanth Jayachandran