HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a high number of grouping set keys

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0
    • Component/s: Query Processor
    • Labels: None

    Description

      This is a follow-up to HIVE-3433.

      Had an offline discussion with Sambavi, who pointed out a scenario where the
      implementation in HIVE-3433 will not scale. Assume that the user is performing
      a cube on many columns, say 8. A cube on 8 columns has 2^8 = 256 grouping sets,
      so each input row would generate 256 rows for the hash table, which may
      overwhelm the current group by implementation.
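
The row blow-up can be illustrated with a small Python sketch. This is not Hive code: the helper names (`cube_grouping_sets`, `expand_row`) and the column names `c0`..`c7` are made up for illustration, and excluded keys are simply represented as `None`.

```python
from itertools import combinations

def cube_grouping_sets(keys):
    """Return every subset of `keys`, i.e. the grouping sets a cube implies."""
    return [subset
            for size in range(len(keys), -1, -1)
            for subset in combinations(keys, size)]

def expand_row(row, keys):
    """Emit one copy of `row` per grouping set; keys left out become None."""
    out = []
    for subset in cube_grouping_sets(keys):
        kept = set(subset)
        out.append(tuple(row[k] if k in kept else None for k in keys))
    return out

# Hypothetical 8-column input row: {"c0": 0, "c1": 1, ..., "c7": 7}
keys = [f"c{i}" for i in range(8)]
row = {k: i for i, k in enumerate(keys)}
expanded = expand_row(row, keys)
print(len(expanded))  # 256
```

Every input row turns into 256 hash table candidates before any aggregation has reduced the data.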

      A better implementation would be to add an additional MR job. In the first
      MR job, perform the group by as if there were no cube. In the second MR job,
      perform the cube on the output of the first. The assumption is that the group by
      will have reduced the data significantly, and that its rows arrive ordered by the
      grouping keys, which gives a higher probability of hitting the hash table.
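
A minimal Python sketch of the two approaches, under stated assumptions: the aggregate is a plain count (so partial results can be merged by addition), the input contains no NULL keys, and all function names are hypothetical. It models the idea only, not the Hive operators in the patch.

```python
from collections import Counter
from itertools import combinations

def grouping_sets(n):
    """All subsets of key positions 0..n-1 (the grouping sets of a cube)."""
    return [set(s) for size in range(n + 1)
            for s in combinations(range(n), size)]

def mask(key, kept):
    """Keep the key positions in `kept`; replace the others with None."""
    return tuple(v if i in kept else None for i, v in enumerate(key))

def cube_one_pass(rows, n):
    """Expand every input row into 2**n rows, then count."""
    counts, emitted = Counter(), 0
    for key in rows:
        for kept in grouping_sets(n):
            counts[mask(key, kept)] += 1
            emitted += 1
    return counts, emitted

def cube_two_pass(rows, n):
    """Job 1: plain group by. Job 2: cube over the pre-aggregated rows."""
    partial = Counter(rows)                    # job 1: no cube
    counts, emitted = Counter(), 0
    for key, cnt in sorted(partial.items()):   # job 2: rows ordered by key
        for kept in grouping_sets(n):
            counts[mask(key, kept)] += cnt     # merge partial counts
            emitted += 1
    return counts, emitted

# Hypothetical input: 4000 rows, only 3 distinct key combinations.
rows = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "x")] * 1000
one, emitted_one = cube_one_pass(rows, 2)
two, emitted_two = cube_two_pass(rows, 2)
print(one == two, emitted_one, emitted_two)  # True 16000 12
```

Both variants produce the same counts, but the two-pass variant expands only the distinct key combinations left after the first group by. The benefit depends on how much that first group by reduces the data.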

      Attachments

        1. hive.3552.1.patch
          180 kB
          Namit Jain
        2. hive.3552.10.patch
          226 kB
          Namit Jain
        3. hive.3552.11.patch
          226 kB
          Namit Jain
        4. hive.3552.12.patch
          226 kB
          Namit Jain
        5. hive.3552.2.patch
          179 kB
          Namit Jain
        6. hive.3552.3.patch
          219 kB
          Namit Jain
        7. hive.3552.4.patch
          219 kB
          Namit Jain
        8. hive.3552.5.patch
          221 kB
          Namit Jain
        9. hive.3552.6.patch
          221 kB
          Namit Jain
        10. hive.3552.7.patch
          221 kB
          Namit Jain
        11. hive.3552.8.patch
          226 kB
          Namit Jain
        12. hive.3552.9.patch
          226 kB
          Namit Jain

            People

              Assignee: Namit Jain
              Reporter: Namit Jain
              Votes: 0
              Watchers: 9