Hive / HIVE-432

SORT BY always sends data to only the first reducer, even when there are multiple reducers

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.3.0
    • Component/s: Query Processor
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      When we generate the ReduceSinkOperator for SORT BY, the partition columns are empty. As a result, every row gets a hash value of 0, and all rows go to the first reducer.

      In the meantime, while we are fixing this bug, please use "CLUSTER BY" instead of "SORT BY" so that the data gets distributed across multiple reducers.
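
The behavior described above can be illustrated with a small sketch. This is not Hive's actual code: `partition_hash` and `assign_reducers` are hypothetical names, the hash-combining rule is an assumption chosen for illustration, and the keys are restricted to integers to keep the result deterministic. It only shows why an empty list of partition columns sends every row to reducer 0, and why partitioning on a column (as "CLUSTER BY" does) spreads the rows out.

```python
def partition_hash(values):
    """Combine the partition column values into one non-negative hash.

    Hypothetical combining rule (integer values only). With no partition
    columns the loop never runs, so the hash stays 0.
    """
    h = 0
    for v in values:
        h = (31 * h + int(v)) & 0x7FFFFFFF
    return h


def assign_reducers(rows, partition_cols, num_reducers):
    """Return the reducer index chosen for each row."""
    return [
        partition_hash([row[col] for col in partition_cols]) % num_reducers
        for row in rows
    ]


rows = [{"key": k, "value": "v%d" % k} for k in range(8)]

# Empty partition columns (the bug): every row hashes to 0.
print(assign_reducers(rows, [], 4))       # [0, 0, 0, 0, 0, 0, 0, 0]

# Partitioning on "key" (the CLUSTER BY workaround): rows are spread out.
print(assign_reducers(rows, ["key"], 4))  # [0, 1, 2, 3, 0, 1, 2, 3]
```

With four reducers and an empty partition column list, reducers 1 to 3 receive no rows at all, which matches the symptom in the title.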

      Attachments

        1. HIVE-432.1.patch
          2 kB
          Zheng Shao

          People

            Assignee: Zheng Shao (zshao)
            Reporter: Zheng Shao (zshao)
