Hive / HIVE-2146

Block Sampling should adjust number of reducers accordingly to make it useful

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently, the number of reducers is not adjusted when block sampling is used, so a query like:
      select c from tab tablesample(1 percent) group by c;
      can launch a huge number of reducers even though the sampled input is small.
      We need to reduce the number of reducers to make block sampling more useful.
      Because the number of reducers is currently determined before the splits are computed, the way to do this is probably not fully clean, but we can make a good estimate.
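
      A minimal sketch of the kind of estimate described above, not the code in the attached patches. It assumes the reducer count is derived from the total input size divided by a bytes-per-reducer target (as with hive.exec.reducers.bytes.per.reducer) and capped by a maximum (as with hive.exec.reducers.max), and simply scales the input size by the sampling percentage first. The class name, method name, and input sizes are hypothetical and chosen for illustration only.

```java
public class SampledReducerEstimate {

    /**
     * Hypothetical helper: estimates a reducer count for a block-sampled input.
     * Assumption: the sampled input is roughly samplePercent of the full input,
     * because the real split sizes are not known yet at this point.
     */
    static int estimateReducers(long totalInputBytes, double samplePercent,
                                long bytesPerReducer, int maxReducers) {
        if (samplePercent <= 0 || samplePercent > 100) {
            throw new IllegalArgumentException("samplePercent must be in (0, 100]");
        }
        if (bytesPerReducer <= 0 || maxReducers <= 0) {
            throw new IllegalArgumentException("bytesPerReducer and maxReducers must be positive");
        }
        // Scale the full input size down to the expected sampled size.
        double sampledBytes = totalInputBytes * samplePercent / 100.0;
        // One reducer per bytesPerReducer of sampled input, rounded up.
        long reducers = (long) Math.ceil(sampledBytes / bytesPerReducer);
        // Always use at least one reducer and never exceed the configured cap.
        reducers = Math.max(1L, reducers);
        return (int) Math.min((long) maxReducers, reducers);
    }

    public static void main(String[] args) {
        long totalInputBytes = 1_000_000_000_000L; // illustrative: 1 TB table
        long bytesPerReducer = 1_000_000_000L;     // illustrative: 1 GB per reducer
        int maxReducers = 999;                     // illustrative cap

        // Without sampling the estimate hits the cap.
        System.out.println(estimateReducers(totalInputBytes, 100.0, bytesPerReducer, maxReducers));
        // With tablesample(1 percent) the estimate shrinks with the input.
        System.out.println(estimateReducers(totalInputBytes, 1.0, bytesPerReducer, maxReducers));
    }
}
```

      With these illustrative numbers, the unsampled estimate is limited by the cap, while the 1 percent sample needs only a handful of reducers. The result is still a guess: block sampling works on whole blocks, so the actual sampled size can differ from the requested percentage.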

      1. HIVE-2146.2.patch
        2 kB
        Siying Dong
      2. HIVE-2146.1.patch
        2 kB
        Siying Dong

          People

          • Assignee:
            Siying Dong
            Reporter:
            Siying Dong