  Hive / HIVE-11031

ORC concatenation of old files can fail while merging column statistics

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0, 2.0.0
    • Fix Version/s: 1.2.1
    • Component/s: None
    • Labels: None

      Description

      Column statistics in ORC are optional protobuf fields. Old ORC files might not have statistics for types that were added later, such as decimal, date, and timestamp. However, column statistics merging assumes that statistics exist for these types and invokes merge unconditionally. For example, merging TimestampColumnStatistics casts the received ColumnStatistics object directly, without an instanceof check. If the ORC file contains timestamp column statistics, the cast succeeds; otherwise it throws a ClassCastException.
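
      The failure mode described above can be illustrated with a minimal sketch. The classes below (StatsMergeSketch, ColumnStats, TimestampStats) are hypothetical stand-ins, not Hive's ORC classes, and the guarded merge shows only one possible policy (treating the merged range as unknown). The description does not say how the attached patches handle the mismatch.

```java
// Hypothetical stand-ins for illustration only; these are not Hive's ORC classes.
final class StatsMergeSketch {

    /** Generic per-column statistics: only a value count. */
    static class ColumnStats {
        long count;

        ColumnStats(long count) {
            this.count = count;
        }

        void merge(ColumnStats other) {
            count += other.count;
        }
    }

    /** Type-specific statistics that also track a min/max range. */
    static class TimestampStats extends ColumnStats {
        Long min; // null means "unknown"
        Long max; // null means "unknown"

        TimestampStats(long count, Long min, Long max) {
            super(count);
            this.min = min;
            this.max = max;
        }

        /** Unsafe variant: casts without checking the runtime type. */
        void mergeUnchecked(ColumnStats other) {
            // Throws ClassCastException when 'other' is only generic statistics.
            TimestampStats ts = (TimestampStats) other;
            mergeRange(ts);
            super.merge(other);
        }

        /** Guarded variant: checks the runtime type before casting. */
        @Override
        void merge(ColumnStats other) {
            if (other instanceof TimestampStats) {
                mergeRange((TimestampStats) other);
            } else {
                // Assumed policy: the other side has no type-specific
                // statistics, so the combined range is no longer known.
                min = null;
                max = null;
            }
            super.merge(other);
        }

        private void mergeRange(TimestampStats ts) {
            if (min == null || ts.min == null) {
                min = null;
            } else {
                min = Math.min(min, ts.min);
            }
            if (max == null || ts.max == null) {
                max = null;
            } else {
                max = Math.max(max, ts.max);
            }
        }
    }
}
```

      With the guarded variant, merging statistics from a file that lacks timestamp statistics still accumulates the count, while the unchecked variant fails on the cast before any state changes.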

      Also, the file merge operator swallows the exception.
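
      Why a swallowed exception makes this harder to diagnose can be sketched as follows. MergeErrorHandlingSketch and both method names are hypothetical and do not reflect the actual file merge operator code; the sketch only contrasts dropping a failure with rethrowing it together with its cause.

```java
import java.io.IOException;

// Hypothetical sketch; not the Hive file merge operator.
final class MergeErrorHandlingSketch {

    /** Swallowing variant: the failure is dropped and the caller sees success. */
    static void processSwallowing(Runnable mergeStep) {
        try {
            mergeStep.run();
        } catch (RuntimeException e) {
            // Nothing is logged or rethrown, so the failed merge goes unnoticed.
        }
    }

    /** Propagating variant: the failure is wrapped and rethrown with its cause. */
    static void processPropagating(Runnable mergeStep) throws IOException {
        try {
            mergeStep.run();
        } catch (RuntimeException e) {
            throw new IOException("Failed to merge column statistics", e);
        }
    }
}
```

      In the propagating variant the original ClassCastException stays attached as the cause, so the caller can see why the merge failed.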

        Attachments

        1. HIVE-11031.patch (18 kB, Prasanth Jayachandran)
        2. HIVE-11031.2.patch (32 kB, Prasanth Jayachandran)
        3. HIVE-11031.3.patch (32 kB, Prasanth Jayachandran)
        4. HIVE-11031.4.patch (32 kB, Prasanth Jayachandran)
        5. HIVE-11031-branch-1.0.patch (31 kB, Prasanth Jayachandran)

              People

              • Assignee: Prasanth Jayachandran (prasanth_j)
              • Reporter: Prasanth Jayachandran (prasanth_j)
              • Votes: 0
              • Watchers: 6
