Hive / HIVE-1802

Encode MapReduce Shuffling Keys Differently for Single string/bigint Key


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels: None

    Description

      When there is only one shuffling key, delimiters between keys are not needed, and therefore escaping the delimiter bytes is not needed either. Skipping both saves some CPU time on serialization and shuffles slightly less data, which reduces memory footprint and network traffic.

      There is also a bug in group-by: we mistakenly append a -1 to the end of the key, which costs one extra, unnecessary memory copy. This can be easily fixed.
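
      To illustrate the first point, here is a minimal sketch of the idea, not Hive's actual serialization code. The class name, the method names, and the delimiter and escape byte values are all hypothetical. It assumes that the record framing already carries the key length, so a single key needs no terminator.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical illustration only; names and byte values are assumptions.
public class ShuffleKeyEncoding {
    static final byte DELIM = 0x00;   // assumed delimiter written after each key
    static final byte ESCAPE = 0x01;  // assumed escape byte

    // General case: several keys, so each key is terminated by DELIM and
    // any DELIM or ESCAPE byte inside a key is escaped as ESCAPE, byte + 1.
    static byte[] encodeDelimited(List<String> keys) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String key : keys) {
            for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
                if (b == DELIM || b == ESCAPE) {
                    out.write(ESCAPE);
                    out.write(b + 1);
                } else {
                    out.write(b);
                }
            }
            out.write(DELIM);
        }
        return out.toByteArray();
    }

    // Single string key: no delimiter and no escaping pass, just the raw bytes.
    static byte[] encodeSingle(String key) {
        return key.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String key = "a\0b"; // contains a byte that the general path must escape
        System.out.println("delimited: " + encodeDelimited(List.of(key)).length + " bytes");
        System.out.println("single:    " + encodeSingle(key).length + " bytes");
    }
}
```

      In this sketch the single-key path skips the per-byte escape check and emits fewer bytes. A single bigint key could be handled in the same spirit with a fixed-width encoding.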

      Attachments

        1. HIVE-1802.2.patch
          102 kB
          Siying Dong
        2. HIVE-1802.1.patch
          34 kB
          Siying Dong

          People

            Assignee: Siying Dong (sdong)
            Reporter: Siying Dong (sdong)
            Votes: 0
            Watchers: 1
