Hive / HIVE-2566

Reduce the number of map-reduce jobs for union all

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels: None

    Description

      A query like:

      select s.key, s.value from (
        select key, value from src2 where key < 10
        union all
        select key, value from src3 where key < 10
        union all
        select key, value from src4 where key < 10
        union all
        select key, count(1) as value from src5 group by key
      ) s;

      should run only the last sub-query,
      'select key, count(1) as value from src5 group by key',
      as a map-reduce job.

      The union should then be a map-only job that reads from the first three (map-only) sub-queries
      and from the output of that map-reduce job.

      The current plan is very inefficient.
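
      To make the intended plan concrete, the two stages can be imitated by hand. This is
      only a sketch of the plan shape described above, not the optimizer change itself; the
      table name tmp_src5_counts is hypothetical and does not appear in the issue.

      ```sql
      -- Stage 1: the only sub-query with a group by, so the only one
      -- that needs a reduce phase.
      -- tmp_src5_counts is a hypothetical staging table for this sketch.
      create table tmp_src5_counts as
      select key, count(1) as value from src5 group by key;

      -- Stage 2: a union all over plain filters and a table scan.
      -- Nothing here aggregates, so it can run as a single map-only job.
      select s.key, s.value from (
        select key, value from src2 where key < 10
        union all
        select key, value from src3 where key < 10
        union all
        select key, value from src4 where key < 10
        union all
        select key, value from tmp_src5_counts
      ) s;
      ```

      Prefixing either statement with explain prints the stage plan, which is one way to
      check how many map-reduce jobs a given Hive version generates for it.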

            People

              Assignee: Namit Jain (namit)
              Reporter: Namit Jain (namit)