Hive
  1. Hive
  2. HIVE-2479

Log more Hadoop task counter values in the MapRedStats class.

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels:
      None

      Description

      We should log more of the Hadoop task tracker counters in the MapRedStats class, in order to make them available to hooks and improve logging.

      We should make the counters that are logged configurable, so that if different Hadoop versions are used, different counters can be collected.

      1. HIVE-2479.4.patch.txt
        6 kB
        Kevin Wilfong
      2. HIVE-2479.3.patch.txt
        5 kB
        Kevin Wilfong
      3. HIVE-2479.2.patch.txt
        7 kB
        Kevin Wilfong
      4. HIVE-2479.1.patch.txt
        5 kB
        Kevin Wilfong

        Activity

        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2167/
        -----------------------------------------------------------

        (Updated 2011-10-04 17:27:43.657392)

        Review request for hive, Ramkumar Vadali and Yongqiang He.

        Summary
        -------

        I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them.

        I got tired of writing the same code over and over again, so I modified the way MapRedStats and HadoopJobExecHelper treat task counters. MapRedStats now has an enum with all of the task counters we want to collect, it is a subset of the enum in Task$Counter. Task is package private so the enum in it is unavailable. MapRedStats now contains a map from the enum values to the values of the counters, if they were set. HadoopJobExecHelper loops over the enum values and tries to get a value for each counter. As long as the new getter and setter methods are used the functionality is the same, in particular for the getter, if a counter was set, it returns the value of the counter, otherwise it returns -1.

        This addresses bug Hive-2479.
        https://issues.apache.org/jira/browse/Hive-2479

        Diffs


        trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612
        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612

        Diff: https://reviews.apache.org/r/2167/diff

        Testing
        -------

        I ran some queries to verify the counters were being populated.

        I also ran a few of the unit test queries to verify I hadn't broken anything.

        Thanks,

        Kevin

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2167/ ----------------------------------------------------------- (Updated 2011-10-04 17:27:43.657392) Review request for hive, Ramkumar Vadali and Yongqiang He. Summary ------- I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them. I got tired of writing the same code over and over again, so I modified the way MapRedStats and HadoopJobExecHelper treat task counters. MapRedStats now has an enum with all of the task counters we want to collect, it is a subset of the enum in Task$Counter. Task is package private so the enum in it is unavailable. MapRedStats now contains a map from the enum values to the values of the counters, if they were set. HadoopJobExecHelper loops over the enum values and tries to get a value for each counter. As long as the new getter and setter methods are used the functionality is the same, in particular for the getter, if a counter was set, it returns the value of the counter, otherwise it returns -1. This addresses bug Hive-2479. https://issues.apache.org/jira/browse/Hive-2479 Diffs trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612 Diff: https://reviews.apache.org/r/2167/diff Testing ------- I ran some queries to verify the counters were being populated. I also ran a few of the unit test queries to verify I hadn't broken anything. Thanks, Kevin
        Hide
        He Yongqiang added a comment -

        Let's put all job counters into a map in MapRedStats (not just the one listed above), and a hook can do anything he wants

        Show
        He Yongqiang added a comment - Let's put all job counters into a map in MapRedStats (not just the one listed above), and a hook can do anything he wants
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2167/#review2316
        -----------------------------------------------------------

        trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java
        <https://reviews.apache.org/r/2167/#comment5322>

        don't put the counter names here, let's use a map and pass it the hook

        • Yongqiang

        On 2011-10-04 17:27:43, Kevin Wilfong wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/2167/

        -----------------------------------------------------------

        (Updated 2011-10-04 17:27:43)

        Review request for hive, Ramkumar Vadali and Yongqiang He.

        Summary

        -------

        I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them.

        I got tired of writing the same code over and over again, so I modified the way MapRedStats and HadoopJobExecHelper treat task counters. MapRedStats now has an enum with all of the task counters we want to collect, it is a subset of the enum in Task$Counter. Task is package private so the enum in it is unavailable. MapRedStats now contains a map from the enum values to the values of the counters, if they were set. HadoopJobExecHelper loops over the enum values and tries to get a value for each counter. As long as the new getter and setter methods are used the functionality is the same, in particular for the getter, if a counter was set, it returns the value of the counter, otherwise it returns -1.

        This addresses bug Hive-2479.

        https://issues.apache.org/jira/browse/Hive-2479

        Diffs

        -----

        trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612

        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612

        Diff: https://reviews.apache.org/r/2167/diff

        Testing

        -------

        I ran some queries to verify the counters were being populated.

        I also ran a few of the unit test queries to verify I hadn't broken anything.

        Thanks,

        Kevin

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2167/#review2316 ----------------------------------------------------------- trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java < https://reviews.apache.org/r/2167/#comment5322 > don't put the counter names here, let's use a map and pass it the hook Yongqiang On 2011-10-04 17:27:43, Kevin Wilfong wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2167/ ----------------------------------------------------------- (Updated 2011-10-04 17:27:43) Review request for hive, Ramkumar Vadali and Yongqiang He. Summary ------- I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them. I got tired of writing the same code over and over again, so I modified the way MapRedStats and HadoopJobExecHelper treat task counters. MapRedStats now has an enum with all of the task counters we want to collect, it is a subset of the enum in Task$Counter. Task is package private so the enum in it is unavailable. MapRedStats now contains a map from the enum values to the values of the counters, if they were set. HadoopJobExecHelper loops over the enum values and tries to get a value for each counter. As long as the new getter and setter methods are used the functionality is the same, in particular for the getter, if a counter was set, it returns the value of the counter, otherwise it returns -1. This addresses bug Hive-2479. https://issues.apache.org/jira/browse/Hive-2479 Diffs ----- trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612 Diff: https://reviews.apache.org/r/2167/diff Testing ------- I ran some queries to verify the counters were being populated. I also ran a few of the unit test queries to verify I hadn't broken anything. Thanks, Kevin
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2167/
        -----------------------------------------------------------

        (Updated 2011-10-04 17:38:18.076011)

        Review request for hive, Ramkumar Vadali and Yongqiang He.

        Changes
        -------

        I added everything from Task$Counter except CPU_MILLISECONDS because currently that receives special treatment in HadoopJobExecHelper.

        Summary
        -------

        I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them.

        I got tired of writing the same code over and over again, so I modified the way MapRedStats and HadoopJobExecHelper treat task counters. MapRedStats now has an enum with all of the task counters we want to collect, it is a subset of the enum in Task$Counter. Task is package private so the enum in it is unavailable. MapRedStats now contains a map from the enum values to the values of the counters, if they were set. HadoopJobExecHelper loops over the enum values and tries to get a value for each counter. As long as the new getter and setter methods are used the functionality is the same, in particular for the getter, if a counter was set, it returns the value of the counter, otherwise it returns -1.

        This addresses bug Hive-2479.
        https://issues.apache.org/jira/browse/Hive-2479

        Diffs (updated)


        trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612
        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612

        Diff: https://reviews.apache.org/r/2167/diff

        Testing
        -------

        I ran some queries to verify the counters were being populated.

        I also ran a few of the unit test queries to verify I hadn't broken anything.

        Thanks,

        Kevin

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2167/ ----------------------------------------------------------- (Updated 2011-10-04 17:38:18.076011) Review request for hive, Ramkumar Vadali and Yongqiang He. Changes ------- I added everything from Task$Counter except CPU_MILLISECONDS because currently that receives special treatment in HadoopJobExecHelper. Summary ------- I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them. I got tired of writing the same code over and over again, so I modified the way MapRedStats and HadoopJobExecHelper treat task counters. MapRedStats now has an enum with all of the task counters we want to collect, it is a subset of the enum in Task$Counter. Task is package private so the enum in it is unavailable. MapRedStats now contains a map from the enum values to the values of the counters, if they were set. HadoopJobExecHelper loops over the enum values and tries to get a value for each counter. As long as the new getter and setter methods are used the functionality is the same, in particular for the getter, if a counter was set, it returns the value of the counter, otherwise it returns -1. This addresses bug Hive-2479. https://issues.apache.org/jira/browse/Hive-2479 Diffs (updated) trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612 Diff: https://reviews.apache.org/r/2167/diff Testing ------- I ran some queries to verify the counters were being populated. I also ran a few of the unit test queries to verify I hadn't broken anything. Thanks, Kevin
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2167/
        -----------------------------------------------------------

        (Updated 2011-10-04 22:58:24.529007)

        Review request for hive, Ramkumar Vadali and Yongqiang He.

        Changes
        -------

        MapRedStats provides an enum which just contains those counters that were being used before this diff. I also made it take a generic enum, and made the task counter enum used by HadoopJobExecHelper configurable. This way, if one Hadoop version contains more enums than another, the counters to be logged can be modified by creating a new enum.

        Summary
        -------

        I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them.

        I got tired of writing the same code over and over again, so I modified the way MapRedStats and HadoopJobExecHelper treat task counters. MapRedStats now has an enum with all of the task counters we want to collect, it is a subset of the enum in Task$Counter. Task is package private so the enum in it is unavailable. MapRedStats now contains a map from the enum values to the values of the counters, if they were set. HadoopJobExecHelper loops over the enum values and tries to get a value for each counter. As long as the new getter and setter methods are used the functionality is the same, in particular for the getter, if a counter was set, it returns the value of the counter, otherwise it returns -1.

        This addresses bug Hive-2479.
        https://issues.apache.org/jira/browse/Hive-2479

        Diffs (updated)


        trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1178612
        trunk/conf/hive-default.xml 1178612
        trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612
        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612

        Diff: https://reviews.apache.org/r/2167/diff

        Testing
        -------

        I ran some queries to verify the counters were being populated.

        I also ran a few of the unit test queries to verify I hadn't broken anything.

        Thanks,

        Kevin

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2167/ ----------------------------------------------------------- (Updated 2011-10-04 22:58:24.529007) Review request for hive, Ramkumar Vadali and Yongqiang He. Changes ------- MapRedStats provides an enum which just contains those counters that were being used before this diff. I also made it take a generic enum, and made the task counter enum used by HadoopJobExecHelper configurable. This way, if one Hadoop version contains more enums than another, the counters to be logged can be modified by creating a new enum. Summary ------- I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them. I got tired of writing the same code over and over again, so I modified the way MapRedStats and HadoopJobExecHelper treat task counters. MapRedStats now has an enum with all of the task counters we want to collect, it is a subset of the enum in Task$Counter. Task is package private so the enum in it is unavailable. MapRedStats now contains a map from the enum values to the values of the counters, if they were set. HadoopJobExecHelper loops over the enum values and tries to get a value for each counter. As long as the new getter and setter methods are used the functionality is the same, in particular for the getter, if a counter was set, it returns the value of the counter, otherwise it returns -1. This addresses bug Hive-2479. https://issues.apache.org/jira/browse/Hive-2479 Diffs (updated) trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1178612 trunk/conf/hive-default.xml 1178612 trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612 Diff: https://reviews.apache.org/r/2167/diff Testing ------- I ran some queries to verify the counters were being populated. I also ran a few of the unit test queries to verify I hadn't broken anything. Thanks, Kevin
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2167/#review2335
        -----------------------------------------------------------

        trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java
        <https://reviews.apache.org/r/2167/#comment5396>

        remove the enum, just either put all the counters objects into this MapRedStats, or put a map<counterGrpAndName, counterInst> here.

        For simplicity, we can just put the counters object here and hooks can anything they want.

        • Yongqiang

        On 2011-10-04 22:58:24, Kevin Wilfong wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/2167/

        -----------------------------------------------------------

        (Updated 2011-10-04 22:58:24)

        Review request for hive, Ramkumar Vadali and Yongqiang He.

        Summary

        -------

        I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them.

        I got tired of writing the same code over and over again, so I modified the way MapRedStats and HadoopJobExecHelper treat task counters. MapRedStats now has an enum with all of the task counters we want to collect, it is a subset of the enum in Task$Counter. Task is package private so the enum in it is unavailable. MapRedStats now contains a map from the enum values to the values of the counters, if they were set. HadoopJobExecHelper loops over the enum values and tries to get a value for each counter. As long as the new getter and setter methods are used the functionality is the same, in particular for the getter, if a counter was set, it returns the value of the counter, otherwise it returns -1.

        This addresses bug Hive-2479.

        https://issues.apache.org/jira/browse/Hive-2479

        Diffs

        -----

        trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1178612

        trunk/conf/hive-default.xml 1178612

        trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612

        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612

        Diff: https://reviews.apache.org/r/2167/diff

        Testing

        -------

        I ran some queries to verify the counters were being populated.

        I also ran a few of the unit test queries to verify I hadn't broken anything.

        Thanks,

        Kevin

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2167/#review2335 ----------------------------------------------------------- trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java < https://reviews.apache.org/r/2167/#comment5396 > remove the enum, just either put all the counters objects into this MapRedStats, or put a map<counterGrpAndName, counterInst> here. For simplicity, we can just put the counters object here and hooks can anything they want. Yongqiang On 2011-10-04 22:58:24, Kevin Wilfong wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2167/ ----------------------------------------------------------- (Updated 2011-10-04 22:58:24) Review request for hive, Ramkumar Vadali and Yongqiang He. Summary ------- I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them. I got tired of writing the same code over and over again, so I modified the way MapRedStats and HadoopJobExecHelper treat task counters. MapRedStats now has an enum with all of the task counters we want to collect, it is a subset of the enum in Task$Counter. Task is package private so the enum in it is unavailable. MapRedStats now contains a map from the enum values to the values of the counters, if they were set. HadoopJobExecHelper loops over the enum values and tries to get a value for each counter. As long as the new getter and setter methods are used the functionality is the same, in particular for the getter, if a counter was set, it returns the value of the counter, otherwise it returns -1. This addresses bug Hive-2479. https://issues.apache.org/jira/browse/Hive-2479 Diffs ----- trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1178612 trunk/conf/hive-default.xml 1178612 trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612 Diff: https://reviews.apache.org/r/2167/diff Testing ------- I ran some queries to verify the counters were being populated. I also ran a few of the unit test queries to verify I hadn't broken anything. Thanks, Kevin
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2167/
        -----------------------------------------------------------

        (Updated 2011-10-05 17:58:04.685771)

        Review request for hive, Ramkumar Vadali and Yongqiang He.

        Changes
        -------

        Thanks Yongqiang, that's a way better idea. I implemented it, again leaving cpuMsec because of the special logic for it in HadoopJobExecHelper, but converting all other counters to use it.

        Summary
        -------

        I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them.

        I got tired of writing the same code over and over again, so I modified the way MapRedStats and HadoopJobExecHelper treat task counters. MapRedStats now has an enum with all of the task counters we want to collect, it is a subset of the enum in Task$Counter. Task is package private so the enum in it is unavailable. MapRedStats now contains a map from the enum values to the values of the counters, if they were set. HadoopJobExecHelper loops over the enum values and tries to get a value for each counter. As long as the new getter and setter methods are used the functionality is the same, in particular for the getter, if a counter was set, it returns the value of the counter, otherwise it returns -1.

        This addresses bug Hive-2479.
        https://issues.apache.org/jira/browse/Hive-2479

        Diffs (updated)


        trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612
        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612

        Diff: https://reviews.apache.org/r/2167/diff

        Testing
        -------

        I ran some queries to verify the counters were being populated.

        I also ran a few of the unit test queries to verify I hadn't broken anything.

        Thanks,

        Kevin

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2167/ ----------------------------------------------------------- (Updated 2011-10-05 17:58:04.685771) Review request for hive, Ramkumar Vadali and Yongqiang He. Changes ------- Thanks Yongqiang, that's a way better idea. I implemented it, again leaving cpuMsec because of the special logic for it in HadoopJobExecHelper, but converting all other counters to use it. Summary ------- I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them. I got tired of writing the same code over and over again, so I modified the way MapRedStats and HadoopJobExecHelper treat task counters. MapRedStats now has an enum with all of the task counters we want to collect, it is a subset of the enum in Task$Counter. Task is package private so the enum in it is unavailable. MapRedStats now contains a map from the enum values to the values of the counters, if they were set. HadoopJobExecHelper loops over the enum values and tries to get a value for each counter. As long as the new getter and setter methods are used the functionality is the same, in particular for the getter, if a counter was set, it returns the value of the counter, otherwise it returns -1. This addresses bug Hive-2479. https://issues.apache.org/jira/browse/Hive-2479 Diffs (updated) trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612 Diff: https://reviews.apache.org/r/2167/diff Testing ------- I ran some queries to verify the counters were being populated. I also ran a few of the unit test queries to verify I hadn't broken anything. Thanks, Kevin
        Hide
        He Yongqiang added a comment -

        Looks good!

        a small comment, and also can you rebase?

        Show
        He Yongqiang added a comment - Looks good! a small comment, and also can you rebase?
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2167/#review2355
        -----------------------------------------------------------

        trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java
        <https://reviews.apache.org/r/2167/#comment5419>

        We may get NPE, so use code like this:

        Counter ctr = null;
        if(counters != null) {
        ctr= counters.findCounter(...);
        if (ctr != null)

        { ...... }

        }
        Or put this to a common method.

        • Yongqiang

        On 2011-10-05 17:58:04, Kevin Wilfong wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/2167/

        -----------------------------------------------------------

        (Updated 2011-10-05 17:58:04)

        Review request for hive, Ramkumar Vadali and Yongqiang He.

        Summary

        -------

        I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them.

        I got tired of writing the same code over and over again, so I modified the way MapRedStats and HadoopJobExecHelper treat task counters. MapRedStats now has an enum with all of the task counters we want to collect, it is a subset of the enum in Task$Counter. Task is package private so the enum in it is unavailable. MapRedStats now contains a map from the enum values to the values of the counters, if they were set. HadoopJobExecHelper loops over the enum values and tries to get a value for each counter. As long as the new getter and setter methods are used the functionality is the same, in particular for the getter, if a counter was set, it returns the value of the counter, otherwise it returns -1.

        This addresses bug Hive-2479.

        https://issues.apache.org/jira/browse/Hive-2479

        Diffs

        -----

        trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612

        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612

        Diff: https://reviews.apache.org/r/2167/diff

        Testing

        -------

        I ran some queries to verify the counters were being populated.

        I also ran a few of the unit test queries to verify I hadn't broken anything.

        Thanks,

        Kevin

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2167/#review2355 ----------------------------------------------------------- trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java < https://reviews.apache.org/r/2167/#comment5419 > We may get NPE, so use code like this: Counter ctr = null; if(counters != null) { ctr= counters.findCounter(...); if (ctr != null) { ...... } } Or put this to a common method. Yongqiang On 2011-10-05 17:58:04, Kevin Wilfong wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2167/ ----------------------------------------------------------- (Updated 2011-10-05 17:58:04) Review request for hive, Ramkumar Vadali and Yongqiang He. Summary ------- I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them. I got tired of writing the same code over and over again, so I modified the way MapRedStats and HadoopJobExecHelper treat task counters. MapRedStats now has an enum with all of the task counters we want to collect, it is a subset of the enum in Task$Counter. Task is package private so the enum in it is unavailable. MapRedStats now contains a map from the enum values to the values of the counters, if they were set. HadoopJobExecHelper loops over the enum values and tries to get a value for each counter. As long as the new getter and setter methods are used the functionality is the same, in particular for the getter, if a counter was set, it returns the value of the counter, otherwise it returns -1. This addresses bug Hive-2479. https://issues.apache.org/jira/browse/Hive-2479 Diffs ----- trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1178612 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1178612 Diff: https://reviews.apache.org/r/2167/diff Testing ------- I ran some queries to verify the counters were being populated. I also ran a few of the unit test queries to verify I hadn't broken anything. Thanks, Kevin
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2167/
        -----------------------------------------------------------

        (Updated 2011-10-05 19:38:10.946692)

        Review request for hive, Ramkumar Vadali and Yongqiang He.

        Changes
        -------

        Nice catch Yongqiang, I'm sorry I let that slip through. I rebased and made the change.

        Summary
        -------

        I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them.

        I got tired of writing the same code over and over again, so I modified the way MapRedStats and HadoopJobExecHelper treat task counters. MapRedStats now has an enum with all of the task counters we want to collect, it is a subset of the enum in Task$Counter. Task is package private so the enum in it is unavailable. MapRedStats now contains a map from the enum values to the values of the counters, if they were set. HadoopJobExecHelper loops over the enum values and tries to get a value for each counter. As long as the new getter and setter methods are used the functionality is the same, in particular for the getter, if a counter was set, it returns the value of the counter, otherwise it returns -1.

        This addresses bug Hive-2479.
        https://issues.apache.org/jira/browse/Hive-2479

        Diffs (updated)


        trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1179378
        trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1179378

        Diff: https://reviews.apache.org/r/2167/diff

        Testing
        -------

        I ran some queries to verify the counters were being populated.

        I also ran a few of the unit test queries to verify I hadn't broken anything.

        Thanks,

        Kevin

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2167/ ----------------------------------------------------------- (Updated 2011-10-05 19:38:10.946692) Review request for hive, Ramkumar Vadali and Yongqiang He. Changes ------- Nice catch Yongqiang, I'm sorry I let that slip through. I rebased and made the change. Summary ------- I added the counters mentioned in the task to the MapRedStats class, and modified HadoopJobExecHelper to collect them. I got tired of writing the same code over and over again, so I modified the way MapRedStats and HadoopJobExecHelper treat task counters. MapRedStats now has an enum with all of the task counters we want to collect, it is a subset of the enum in Task$Counter. Task is package private so the enum in it is unavailable. MapRedStats now contains a map from the enum values to the values of the counters, if they were set. HadoopJobExecHelper loops over the enum values and tries to get a value for each counter. As long as the new getter and setter methods are used the functionality is the same, in particular for the getter, if a counter was set, it returns the value of the counter, otherwise it returns -1. This addresses bug Hive-2479. https://issues.apache.org/jira/browse/Hive-2479 Diffs (updated) trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java 1179378 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1179378 Diff: https://reviews.apache.org/r/2167/diff Testing ------- I ran some queries to verify the counters were being populated. I also ran a few of the unit test queries to verify I hadn't broken anything. Thanks, Kevin
        Hide
        He Yongqiang added a comment -

        +1, will commit after tests pass.

        Show
        He Yongqiang added a comment - +1, will commit after tests pass.
        Hide
        He Yongqiang added a comment -

        Committed, thanks Kevin Wilfong!

        Show
        He Yongqiang added a comment - Committed, thanks Kevin Wilfong!
        Hide
        Hudson added a comment -

        Integrated in Hive-trunk-h0.21 #995 (See https://builds.apache.org/job/Hive-trunk-h0.21/995/)
        HIVE-2479: Log more Hadoop task counter values in the MapRedStats class (Kevin Wilfong via He Yongqiang)

        heyongqiang : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1179493
        Files :

        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java
        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java
        Show
        Hudson added a comment - Integrated in Hive-trunk-h0.21 #995 (See https://builds.apache.org/job/Hive-trunk-h0.21/995/ ) HIVE-2479 : Log more Hadoop task counter values in the MapRedStats class (Kevin Wilfong via He Yongqiang) heyongqiang : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1179493 Files : /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java

          People

          • Assignee:
            Kevin Wilfong
            Reporter:
            Kevin Wilfong
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development