HIVE-2037

Merge result file size should honor hive.merge.size.per.task

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels: None

      Description

      The merge job sets mapred.min.split.size to the value of hive.merge.size.per.task, which is roughly the intended output file size. However, the input split size is also determined by mapred.min.split.size.per.node, mapred.min.split.size.per.rack, and mapred.max.split.size. These should be set to the value of hive.merge.size.per.task as well.
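
The proposed change can be sketched as follows. This is only an illustration of the idea in the description, not the committed patch: the helper name `merge_job_split_settings`, the use of a plain dict as a stand-in for the job configuration, and the example sizes are all assumptions. Only the property names come from the description above.

```python
# Illustrative sketch only: a plain dict stands in for the job configuration,
# and the helper name is hypothetical (it is not part of Hive).
SPLIT_SIZE_KEYS = (
    "mapred.min.split.size",
    "mapred.min.split.size.per.node",
    "mapred.min.split.size.per.rack",
    "mapred.max.split.size",
)


def merge_job_split_settings(conf):
    """Return a copy of conf in which every split-size property
    is set to the value of hive.merge.size.per.task."""
    target = conf["hive.merge.size.per.task"]
    updated = dict(conf)  # leave the caller's configuration untouched
    for key in SPLIT_SIZE_KEYS:
        updated[key] = target
    return updated


# Example values chosen for illustration; they are not taken from the issue.
conf = {
    "hive.merge.size.per.task": "256000000",
    "mapred.max.split.size": "64000000",
}
merged = merge_job_split_settings(conf)
```

With all four properties aligned, no single split-size setting is left at a smaller value that could cap the merge output below the configured target.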

        Activity

        Joydeep Sen Sarma added a comment -

        looks ok - please run the tests and i will commit.

        Ning Zhang added a comment -

        @joy, the unit tests are clean.

        Joydeep Sen Sarma added a comment -

        committed. thanks Ning.


          People

          • Assignee:
            Ning Zhang
            Reporter:
            Ning Zhang
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved:
