  1. Hive
  2. HIVE-16663

String Caching For Rows


    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.1
    • Fix Version/s: 3.0.0
    • Component/s: Beeline
    • Labels: None
    • Flags: Patch

      Description

      Query result sets often contain many repeated values, especially when the query includes JOINs. Beeline currently makes no attempt to cache these values, so it consumes a lot of memory.

      Adding a string cache could save a lot of memory. Some organizations use Beeline for ETL, processing result sets into CSV, and this change would support them better.
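
      To illustrate the general idea of a string cache for rows, here is a minimal sketch. It is not taken from any of the attached patches: the class name RowStringCache and its methods are hypothetical, and the sketch assumes each row arrives as a String[] of column values. Equal values are replaced by one shared instance, so repeated values no longer hold separate copies in memory.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not the code from the HIVE-16663 patches. */
final class RowStringCache {
    // Maps each distinct value to the first String instance seen with that value.
    private final Map<String, String> cache = new HashMap<>();

    /** Returns a canonical instance equal to value; null passes through unchanged. */
    String dedupe(String value) {
        if (value == null) {
            return null;
        }
        String existing = cache.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }

    /** Returns a copy of the row in which every column value is the canonical instance. */
    String[] dedupeRow(String[] row) {
        String[] out = new String[row.length];
        for (int i = 0; i < row.length; i++) {
            out[i] = dedupe(row[i]);
        }
        return out;
    }

    /** Number of distinct values currently held by the cache. */
    int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        RowStringCache cache = new RowStringCache();
        // new String(...) forces distinct objects, as separately decoded rows would be.
        String[] row1 = cache.dedupeRow(new String[] {new String("US"), new String("alice")});
        String[] row2 = cache.dedupeRow(new String[] {new String("US"), new String("bob")});
        System.out.println(row1[0] == row2[0]); // true: both rows share one "US" instance
        System.out.println(cache.size());       // 3: "US", "alice", "bob"
    }
}
```

      This sketch never evicts entries, so a column with mostly unique values would make the cache itself grow without bound. A real implementation would likely need a size limit or an eviction policy.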

        Attachments

        1. HIVE-16663.7.patch
          1 kB
          Naveen Gangam
        2. HIVE-16663.7.patch
          1 kB
          David Mollitor
        3. HIVE-16663.6.patch
          2 kB
          David Mollitor
        4. HIVE-16663.5.patch
          1 kB
          David Mollitor
        5. HIVE-16663.4.patch
          2 kB
          David Mollitor
        6. HIVE-16663.3.patch
          1 kB
          David Mollitor
        7. HIVE-16663.2.patch
          2 kB
          David Mollitor
        8. HIVE-16663.1.patch
          2 kB
          David Mollitor


            People

            • Assignee: David Mollitor (belugabehr)
            • Reporter: David Mollitor (belugabehr)
            • Votes: 0
            • Watchers: 4

              Dates

              • Created:
                Updated:
                Resolved: