Hive / HIVE-17417

LazySimple Timestamp is very expensive

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.0.0, 2.4.0
    • Fix Version/s: 3.0.0, 2.4.0
    • Labels:
      None

      Description

      When a schema contains an array<struct> column with timestamp and date fields, and the arrays are large (more than 10,000 elements), any access to that column is very expensive in terms of CPU, because most of the time goes to serializing the timestamp and date values. See the attached profiles: more than 70% of the time is spent in serialization and toString conversions.
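
      The ticket text does not say what the fix was, so the following is only a general illustration of why per-value string conversion can dominate CPU for large arrays. It is a minimal sketch, not Hive's LazySimple code: the class name TimestampFormatSketch, both method names, and the pattern string are hypothetical, and it assumes the cost comes from rebuilding formatting state for every element rather than reusing it.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the Hive code base.
public class TimestampFormatSketch {
    // Assumed pattern, chosen for illustration.
    private static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

    // Built once and shared; DateTimeFormatter is immutable and thread-safe.
    private static final DateTimeFormatter SHARED = DateTimeFormatter.ofPattern(PATTERN);

    // Costly variant: parses the pattern and builds a new formatter for every value.
    public static List<String> formatRebuilding(List<LocalDateTime> values) {
        List<String> out = new ArrayList<>(values.size());
        for (LocalDateTime v : values) {
            out.add(DateTimeFormatter.ofPattern(PATTERN).format(v));
        }
        return out;
    }

    // Cheaper variant: reuses the shared formatter for every value.
    public static List<String> formatReusing(List<LocalDateTime> values) {
        List<String> out = new ArrayList<>(values.size());
        for (LocalDateTime v : values) {
            out.add(SHARED.format(v));
        }
        return out;
    }

    public static void main(String[] args) {
        // Build an array a little larger than the 10,000 elements mentioned in the ticket.
        List<LocalDateTime> values = new ArrayList<>();
        LocalDateTime base = LocalDateTime.of(2017, 8, 30, 12, 0, 0);
        for (int i = 0; i < 10_001; i++) {
            values.add(base.plusSeconds(i));
        }

        long t0 = System.nanoTime();
        formatRebuilding(values);
        long t1 = System.nanoTime();
        formatReusing(values);
        long t2 = System.nanoTime();

        // Rough wall-clock numbers only; a harness such as JMH gives reliable figures.
        System.out.println("rebuilding: " + (t1 - t0) / 1_000 + " us");
        System.out.println("reusing:    " + (t2 - t1) / 1_000 + " us");
    }
}
```

      Both variants produce identical strings; only the amount of work per element differs. Timings from main are indicative at best, since JIT warm-up and run order affect them.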

        Attachments

        1. HIVE-17417.6.patch
          2 kB
          Prasanth Jayachandran
        2. HIVE-17417.5.patch
          2 kB
          Prasanth Jayachandran
        3. HIVE-17417.4.patch
          2 kB
          Prasanth Jayachandran
        4. HIVE-17417.3.patch
          3 kB
          Prasanth Jayachandran
        5. HIVE-17417.2.patch
          3 kB
          Prasanth Jayachandran
        6. HIVE-17417.1.patch
          2 kB
          Prasanth Jayachandran
        7. ts-jmh-perf.png
          17 kB
          Prasanth Jayachandran
        8. timestamp-serialize.png
          381 kB
          Prasanth Jayachandran
        9. date-serialize.png
          347 kB
          Prasanth Jayachandran

              People

              • Assignee:
                prasanth_j Prasanth Jayachandran
                Reporter:
                prasanth_j Prasanth Jayachandran
              • Votes:
                1
                Watchers:
                6