Hive / HIVE-6518

Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.13.0
    • Fix Version/s: 0.13.0
    • Component/s: Query Processor
    • Labels: None
    • Release Note: Flush VectorGroupBy aggregation hashes in case of a full GC

    Description

      The current VectorGroupByOperator implementation flushes the in-memory hashes when either the maximum number of entries or the configured fraction of memory is reached.

      This works for most cases, but in some corner cases the rest of the pipeline pushes the task into the GC overhead limit or the heap size limit before either of those conditions is reached.

      This patch adds a SoftReference as a GC canary. If the soft reference has been cleared, a full GC pass happened in the recent past, and the aggregation hash tables should be flushed immediately, before another full GC is triggered.
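
      The canary idea can be illustrated with a minimal, self-contained Java sketch. This is not the code from the attached patches: the class `GcCanarySketch`, its fields, and the `simulateFullGc()` helper are hypothetical names chosen for illustration, and a plain `HashMap` stands in for the operator's aggregation hash tables. Note also that the JVM clears soft references at its own discretion under memory pressure, so a cleared canary is a strong hint of a recent heavy GC rather than a precise signal.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a SoftReference used as a GC canary.
class GcCanarySketch {
    // Stand-in for the operator's in-memory aggregation hashes (assumption).
    private final Map<String, Long> hashes = new HashMap<>();

    // The canary: a softly reachable object nothing else refers to.
    private SoftReference<Object> gcCanary = new SoftReference<>(new Object());

    private int flushCount = 0;
    private long flushedEntries = 0;

    // Aggregate one value, then flush if the canary has been cleared.
    void add(String key, long value) {
        hashes.merge(key, value, Long::sum);
        if (gcCanary.get() == null) {
            flush();
        }
    }

    // Drop the in-memory state (a real operator would emit it downstream)
    // and install a fresh canary for the next GC cycle.
    void flush() {
        flushedEntries += hashes.size();
        hashes.clear();
        flushCount++;
        gcCanary = new SoftReference<>(new Object());
    }

    // Test helper: clears the reference the way the collector would.
    void simulateFullGc() {
        gcCanary.clear();
    }

    int size() { return hashes.size(); }
    int flushCount() { return flushCount; }
    long flushedEntries() { return flushedEntries; }
}
```

      The check costs one `get()` call per processed entry, and re-arming the canary after each flush means one heavy GC triggers at most one extra flush. The entry-count and memory-fraction checks described above would still run alongside it.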

      Attachments

        1. HIVE-6518.3.patch
          3 kB
          Gunther Hagleitner
        2. HIVE-6518.2-tez.patch
          3 kB
          Gopal Vijayaraghavan
        3. HIVE-6518.2.patch
          3 kB
          Gunther Hagleitner
        4. HIVE-6518.1-tez.patch
          2 kB
          Gopal Vijayaraghavan

          People

            Assignee: Gopal Vijayaraghavan (gopalv)
            Reporter: Gopal Vijayaraghavan (gopalv)
            Votes: 0
            Watchers: 5