HIVE-2440

Make Hive mapper initialize faster when there are tons of input files

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      When a Hive job has a very large number of input files, many mappers may fail because they initialize too slowly.

      1. HIVE-2440.3.patch
        4 kB
        He Yongqiang
      2. HIVE-2440.2.patch
        4 kB
        He Yongqiang
      3. HIVE-2440.1.patch
        3 kB
        He Yongqiang

        Activity

        He Yongqiang created issue -
        He Yongqiang made changes -
        Field Original Value New Value
        Attachment HIVE-2440.1.patch [ 12493854 ]
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/1783/
        -----------------------------------------------------------

        Review request for hive and Ning Zhang.

        Summary
        -------

        when one hive job has tons of input files, a lot of mappers may fail because of slow initialization.

        This addresses bug HIVE-2440.
        https://issues.apache.org/jira/browse/HIVE-2440

        Diffs


        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java 1167289

        Diff: https://reviews.apache.org/r/1783/diff

        Testing
        -------

        Thanks,

        Yongqiang

        He Yongqiang added a comment -

        This fixes test failure on combine3

        He Yongqiang made changes -
        Attachment HIVE-2440.2.patch [ 12493866 ]
        He Yongqiang added a comment - https://reviews.apache.org/r/1813/
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/1813/
        -----------------------------------------------------------

        Review request for hive and Ning Zhang.

        Summary
        -------

        make hive mapper initialize faster when having tons of input files

        This addresses bug hive-2440.
        https://issues.apache.org/jira/browse/hive-2440

        Diffs


        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java 1167289

        Diff: https://reviews.apache.org/r/1813/diff

        Testing
        -------

        Thanks,

        Yongqiang

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/1813/#review1859
        -----------------------------------------------------------

        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java
        <https://reviews.apache.org/r/1813/#comment4250>

        Do you need childrenPaths? Paths are only ever added to it; nothing reads it.

        - Ning

        On 2011-09-12 19:15:54, Yongqiang He wrote:

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/1813/
        -----------------------------------------------------------

        (Updated 2011-09-12 19:15:54)

        Review request for hive and Ning Zhang.

        Summary
        -------

        make hive mapper initialize faster when having tons of input files

        This addresses bug hive-2440.
        https://issues.apache.org/jira/browse/hive-2440

        Diffs
        -----

        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java 1167289

        Diff: https://reviews.apache.org/r/1813/diff

        Testing
        -------

        Thanks,

        Yongqiang

        He Yongqiang added a comment -

        removed childrenPaths from MapOp

        He Yongqiang made changes -
        Attachment HIVE-2440.3.patch [ 12494122 ]
        Ning Zhang added a comment -

        +1. Will commit if tests pass.

        Ning Zhang added a comment -

        Committed. Thanks Yongqiang!

        Ning Zhang made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Hadoop Flags [Reviewed]
        Fix Version/s 0.9.0 [ 12317742 ]
        Resolution Fixed [ 1 ]
        Hudson added a comment -

        Integrated in Hive-trunk-h0.21 #953 (See https://builds.apache.org/job/Hive-trunk-h0.21/953/)
        HIVE-2440. make hive mapper initialize faster when having tons of input files (Yongqiang He via Ning Zhang)

        nzhang : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1170453
        Files :

        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java
        Carl Steinbach made changes -
        Fix Version/s 0.8.0 [ 12316178 ]
        Carl Steinbach made changes -
        Fix Version/s 0.9.0 [ 12317742 ]
        Carl Steinbach made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Transition           Time In Source Status   Execution Times   Last Executer    Last Execution Date
        Open -> Resolved     4d 11h 41m              1                 Ning Zhang       14/Sep/11 07:59
        Resolved -> Closed   93d 16h 57m             1                 Carl Steinbach   16/Dec/11 23:56

          People

          • Assignee: He Yongqiang
          • Reporter: He Yongqiang
          • Votes: 0
          • Watchers: 0
