HCatalog
  1. HCatalog
  2. HCATALOG-373

ProgressReporter should work with both old and new MR API

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.1
    • Component/s: None
    • Labels:
      None

      Description

      org.apache.hcatalog.mapreduce.ProgressReporter currently implements org.apache.hadoop.mapred.Reporter. It should also extend org.apache.hadoop.mapreduce.StatusReporter so it works with code expecting either an old or new API reporter.

      The use case is using a wrapper so a serde works with a new-API input format.

      https://github.com/kevinweil/elephant-bird/blob/master/src/java/com/twitter/elephantbird/mapred/input/DeprecatedInputFormatWrapper.java#L163

      1. HCATALOG-373_progress_reporter_2.diff
        3 kB
        Travis Crawford
      2. HCATALOG-373_progress_reporter_3.diff
        15 kB
        Travis Crawford
      3. HCATALOG-373_progress_reporter_4.diff
        14 kB
        Travis Crawford
      4. HCATALOG-373_progress_reporter.diff
        3 kB
        Travis Crawford

        Activity

        Hide
        Rohini Palaniswamy added a comment -

        The Counters object is internal to the class and increment operations will not be reflected in the job counters. It would rather be better to leave it at empty implementation for incrCounter as before instead of having a local variable that will not be used anywhere. Or if you can fix it to call the TaskInputOutputContext (or the Reporter.NULL) methods for setStatus, incrCounter and getCounter it would be good. That way we can have RecordWriter's in the future that output some counters.

        Show
        Rohini Palaniswamy added a comment - The Counters object is internal to the class and increment operations will not be reflected in the job counters. It would rather be better to leave it at empty implementation for incrCounter as before instead of having a local variable that will not be used anywhere. Or if you can fix it to call the TaskInputOutputContext (or the Reporter.NULL) methods for setStatus, incrCounter and getCounter it would be good. That way we can have RecordWriter's in the future that output some counters.
        Hide
        Travis Crawford added a comment -

        Good suggestion Rohini, I've updated the patch to continue using Reporter.NULL. I briefly looked at adding proper counter support but I'm not immediately sure how that would work, and would prefer to leave for a future change.

        Show
        Travis Crawford added a comment - Good suggestion Rohini, I've updated the patch to continue using Reporter.NULL . I briefly looked at adding proper counter support but I'm not immediately sure how that would work, and would prefer to leave for a future change.
        Hide
        Rohini Palaniswamy added a comment -

        Looked at the new diff. Calling Reporter.NULL methods does not add any value. It just has empty implementations for void methods and "return null" for methods with return values. Can we directly do the same in ProgressReporter.java instead of calling Reporter.NULL? The previous code also had empty methods.

        Show
        Rohini Palaniswamy added a comment - Looked at the new diff. Calling Reporter.NULL methods does not add any value. It just has empty implementations for void methods and "return null" for methods with return values. Can we directly do the same in ProgressReporter.java instead of calling Reporter.NULL? The previous code also had empty methods.
        Hide
        Travis Crawford added a comment -

        Hey Rohini -

        I ended up just fixing the counters. I have it working and will cleanup + post the patch tomorrow.

        Currently the code creates the wrapped record reader in HCatBaseInputFormat.createRecordReader. If we delay creating the wrapped record reader until HCatRecordReader.initialize we have a TaskInputOutputContext and can correctly create a ProgressReporter with counters.

        Looking in MapTask.runNewMapper we see how the initialization works if you're interested in the details.

        Show
        Travis Crawford added a comment - Hey Rohini - I ended up just fixing the counters. I have it working and will cleanup + post the patch tomorrow. Currently the code creates the wrapped record reader in HCatBaseInputFormat.createRecordReader . If we delay creating the wrapped record reader until HCatRecordReader.initialize we have a TaskInputOutputContext and can correctly create a ProgressReporter with counters. Looking in MapTask.runNewMapper we see how the initialization works if you're interested in the details.
        Hide
        Travis Crawford added a comment -

        I believe this version of the patch is ready to go.

        Please see my previous comment for a description of why creation of the base record reader is delayed until initialized.

        ProgressReporter is updated to work in both old and new API contexts, which is sometimes required. In the particular case that caused me to look at this, the Elephant-Bird deprecated API wrappers were having issues with this reporter.

        Show
        Travis Crawford added a comment - I believe this version of the patch is ready to go. Please see my previous comment for a description of why creation of the base record reader is delayed until initialized. ProgressReporter is updated to work in both old and new API contexts, which is sometimes required. In the particular case that caused me to look at this, the Elephant-Bird deprecated API wrappers were having issues with this reporter.
        Hide
        Rohini Palaniswamy added a comment -

        Looks good Travis. +1 (Non-binding).

        Show
        Rohini Palaniswamy added a comment - Looks good Travis. +1 (Non-binding).
        Hide
        Travis Crawford added a comment -

        Hey all - can a committer take a look at this patch and let me know if there's anything else it needs?

        Show
        Travis Crawford added a comment - Hey all - can a committer take a look at this patch and let me know if there's anything else it needs?
        Hide
        Mithun Radhakrishnan added a comment -

        Hey, Travis. FWIW, I finally got my head around this. Looks good (although this will mean I'll have to rebase a patch that I have in flight. )

        (Tiny nitpick: Would you consider changing HCatRecordReader::initialize() to use the hcatSplit variable you've introduced, rather than cast split to HCatSplit again, later in initialize()?)

        Show
        Mithun Radhakrishnan added a comment - Hey, Travis. FWIW, I finally got my head around this. Looks good (although this will mean I'll have to rebase a patch that I have in flight. ) (Tiny nitpick: Would you consider changing HCatRecordReader::initialize() to use the hcatSplit variable you've introduced, rather than cast split to HCatSplit again, later in initialize() ?)
        Hide
        Francis Liu added a comment -

        Travis, the patch doesn't apply cleanly for me. Can you update it and I'll take a look. BTW would appreciate it if you could create a review board for it as well .

        Show
        Francis Liu added a comment - Travis, the patch doesn't apply cleanly for me. Can you update it and I'll take a look. BTW would appreciate it if you could create a review board for it as well .
        Hide
        Travis Crawford added a comment -

        Sure, I'll update to trunk, take a look at the initialize suggestion and post to review board.

        Show
        Travis Crawford added a comment - Sure, I'll update to trunk, take a look at the initialize suggestion and post to review board.
        Hide
        Travis Crawford added a comment -

        Patch updated against trunk, and fixed the suggestion about reusing the HCatSchema instead of recasting.

        This is also viewable at https://github.com/traviscrawford/hcatalog/compare/HCATALOG-373_update_progress_reporter

        I generated this patch with git diff --no-prefix apache/trunk > HCATALOG-373_progress_reporter_4.diff so it should patch cleanly into the SVN repo.

        Show
        Travis Crawford added a comment - Patch updated against trunk, and fixed the suggestion about reusing the HCatSchema instead of recasting. This is also viewable at https://github.com/traviscrawford/hcatalog/compare/HCATALOG-373_update_progress_reporter I generated this patch with git diff --no-prefix apache/trunk > HCATALOG-373 _progress_reporter_4.diff so it should patch cleanly into the SVN repo.
        Hide
        Travis Crawford added a comment -

        @francis - I'm getting the following error when posting to RB:

        Something broke! (Error 500)
        
        It appears something broke when you tried to go to here. This is either a bug in Review Board or a server configuration error. Please report this to your administrator.
        

        Is the github link okay to view the change? If not, I'll try posting this later when hopefully RB is back online.

        Show
        Travis Crawford added a comment - @francis - I'm getting the following error when posting to RB: Something broke! (Error 500) It appears something broke when you tried to go to here. This is either a bug in Review Board or a server configuration error. Please report this to your administrator. Is the github link okay to view the change? If not, I'll try posting this later when hopefully RB is back online.
        Hide
        Francis Liu added a comment -

        travis, when you created the review was it using the "hcatalog-git" repository and not hcatalog?

        Show
        Francis Liu added a comment - travis, when you created the review was it using the "hcatalog-git" repository and not hcatalog?
        Hide
        Travis Crawford added a comment -

        Aah, no, I tried against "hcatalog". Using "hcatalog-git" I was able to post the review. Thanks!

        https://reviews.apache.org/r/4971/

        Show
        Travis Crawford added a comment - Aah, no, I tried against "hcatalog". Using "hcatalog-git" I was able to post the review. Thanks! https://reviews.apache.org/r/4971/
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4971/
        -----------------------------------------------------------

        Review request for hcatalog and Francis Liu.

        Summary
        -------

        Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters.

        This addresses bug HCATALOG-373.
        https://issues.apache.org/jira/browse/HCATALOG-373

        Diffs


        src/java/org/apache/hcatalog/common/HCatUtil.java cb6404a
        src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e
        src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4
        src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd

        Diff: https://reviews.apache.org/r/4971/diff

        Testing
        -------

        "ant clean test" passes

        I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up.

        Thanks,

        Travis

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/ ----------------------------------------------------------- Review request for hcatalog and Francis Liu. Summary ------- Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters. This addresses bug HCATALOG-373 . https://issues.apache.org/jira/browse/HCATALOG-373 Diffs src/java/org/apache/hcatalog/common/HCatUtil.java cb6404a src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4 src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd Diff: https://reviews.apache.org/r/4971/diff Testing ------- "ant clean test" passes I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up. Thanks, Travis
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4971/#review7585
        -----------------------------------------------------------

        Looks good. Some small nitpicks. Would it be possible to add or augment an existing junit test to verify that the counters are updated correctly?

        src/java/org/apache/hcatalog/common/HCatUtil.java
        <https://reviews.apache.org/r/4971/#comment16770>

        can we move this to internal util?

        src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java
        <https://reviews.apache.org/r/4971/#comment16768>

        why not use hcatSplit here?

        src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java
        <https://reviews.apache.org/r/4971/#comment16769>

        and here?

        • Francis

        On 2012-05-02 17:55:29, Travis Crawford wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4971/

        -----------------------------------------------------------

        (Updated 2012-05-02 17:55:29)

        Review request for hcatalog and Francis Liu.

        Summary

        -------

        Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters.

        This addresses bug HCATALOG-373.

        https://issues.apache.org/jira/browse/HCATALOG-373

        Diffs

        -----

        src/java/org/apache/hcatalog/common/HCatUtil.java cb6404a

        src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e

        src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4

        src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd

        Diff: https://reviews.apache.org/r/4971/diff

        Testing

        -------

        "ant clean test" passes

        I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up.

        Thanks,

        Travis

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/#review7585 ----------------------------------------------------------- Looks good. Some small nitpicks. Would it be possible to add or augment an existing junit test to verify that the counters are updated correctly? src/java/org/apache/hcatalog/common/HCatUtil.java < https://reviews.apache.org/r/4971/#comment16770 > can we move this to internal util? src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java < https://reviews.apache.org/r/4971/#comment16768 > why not use hcatSplit here? src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java < https://reviews.apache.org/r/4971/#comment16769 > and here? Francis On 2012-05-02 17:55:29, Travis Crawford wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/ ----------------------------------------------------------- (Updated 2012-05-02 17:55:29) Review request for hcatalog and Francis Liu. Summary ------- Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters. This addresses bug HCATALOG-373 . https://issues.apache.org/jira/browse/HCATALOG-373 Diffs ----- src/java/org/apache/hcatalog/common/HCatUtil.java cb6404a src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4 src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd Diff: https://reviews.apache.org/r/4971/diff Testing ------- "ant clean test" passes I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up. Thanks, Travis
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4971/#review7586
        -----------------------------------------------------------

        I'll take a look at adding a test for this, address the issues below and update the patch. Thanks for the feedback!

        src/java/org/apache/hcatalog/common/HCatUtil.java
        <https://reviews.apache.org/r/4971/#comment16773>

        Good suggestion, I'll move to internal util since its not intended for public use.

        src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java
        <https://reviews.apache.org/r/4971/#comment16771>

        Basically to minimize the diff - will update.

        src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java
        <https://reviews.apache.org/r/4971/#comment16772>

        I don't actually understand this whole properties loop, since its not used. Any clue why this is here? If not, mind if I just remove it entirely? Originally I left this alone since it was not strictly necessary for the patch.

        • Travis

        On 2012-05-02 17:55:29, Travis Crawford wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4971/

        -----------------------------------------------------------

        (Updated 2012-05-02 17:55:29)

        Review request for hcatalog and Francis Liu.

        Summary

        -------

        Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters.

        This addresses bug HCATALOG-373.

        https://issues.apache.org/jira/browse/HCATALOG-373

        Diffs

        -----

        src/java/org/apache/hcatalog/common/HCatUtil.java cb6404a

        src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e

        src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4

        src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd

        Diff: https://reviews.apache.org/r/4971/diff

        Testing

        -------

        "ant clean test" passes

        I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up.

        Thanks,

        Travis

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/#review7586 ----------------------------------------------------------- I'll take a look at adding a test for this, address the issues below and update the patch. Thanks for the feedback! src/java/org/apache/hcatalog/common/HCatUtil.java < https://reviews.apache.org/r/4971/#comment16773 > Good suggestion, I'll move to internal util since its not intended for public use. src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java < https://reviews.apache.org/r/4971/#comment16771 > Basically to minimize the diff - will update. src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java < https://reviews.apache.org/r/4971/#comment16772 > I don't actually understand this whole properties loop, since its not used. Any clue why this is here? If not, mind if I just remove it entirely? Originally I left this alone since it was not strictly necessary for the patch. Travis On 2012-05-02 17:55:29, Travis Crawford wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/ ----------------------------------------------------------- (Updated 2012-05-02 17:55:29) Review request for hcatalog and Francis Liu. Summary ------- Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters. This addresses bug HCATALOG-373 . https://issues.apache.org/jira/browse/HCATALOG-373 Diffs ----- src/java/org/apache/hcatalog/common/HCatUtil.java cb6404a src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4 src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd Diff: https://reviews.apache.org/r/4971/diff Testing ------- "ant clean test" passes I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up. Thanks, Travis
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4971/
        -----------------------------------------------------------

        (Updated 2012-05-05 01:23:50.756124)

        Review request for hcatalog and Francis Liu.

        Changes
        -------

        Patch updated to include a test that counters exist when input has been read. Initially I went to add a test specific to this functionality but there was significant overlap with what already exists, so I updated an existing test to check that the bytes read counter is set when data has actually been read. This seems like the best approach as the ongoing maintenance will be low. The check includes a comment about why the check exists to help future developers.

        Suggestions from the previous review have also been incorporated. Specifically, we reuse hcatSplit instead of recasting. As the whole setting properties loop is not used I removed that section.

        Summary
        -------

        Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters.

        This addresses bug HCATALOG-373.
        https://issues.apache.org/jira/browse/HCATALOG-373

        Diffs (updated)


        src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e
        src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4
        src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081
        src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd
        src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0

        Diff: https://reviews.apache.org/r/4971/diff

        Testing
        -------

        "ant clean test" passes

        I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up.

        Thanks,

        Travis

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/ ----------------------------------------------------------- (Updated 2012-05-05 01:23:50.756124) Review request for hcatalog and Francis Liu. Changes ------- Patch updated to include a test that counters exist when input has been read. Initially I went to add a test specific to this functionality but there was significant overlap with what already exists, so I updated an existing test to check that the bytes read counter is set when data has actually been read. This seems like the best approach as the ongoing maintenance will be low. The check includes a comment about why the check exists to help future developers. Suggestions from the previous review have also been incorporated. Specifically, we reuse hcatSplit instead of recasting. As the whole setting properties loop is not used I removed that section. Summary ------- Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters. This addresses bug HCATALOG-373 . https://issues.apache.org/jira/browse/HCATALOG-373 Diffs (updated) src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4 src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081 src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0 Diff: https://reviews.apache.org/r/4971/diff Testing ------- "ant clean test" passes I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up. Thanks, Travis
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-05-04 22:39:58, Travis Crawford wrote:

        > src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java, line 116

        > <https://reviews.apache.org/r/4971/diff/1/?file=105999#file105999line116>

        >

        > I don't actually understand this whole properties loop, since its not used. Any clue why this is here? If not, mind if I just remove it entirely? Originally I left this alone since it was not strictly necessary for the patch.

        No idea, looks like dead code to me. I don't mind having it removed as part of this patch.

        • Francis

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4971/#review7586
        -----------------------------------------------------------

        On 2012-05-05 01:23:50, Travis Crawford wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4971/

        -----------------------------------------------------------

        (Updated 2012-05-05 01:23:50)

        Review request for hcatalog and Francis Liu.

        Summary

        -------

        Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters.

        This addresses bug HCATALOG-373.

        https://issues.apache.org/jira/browse/HCATALOG-373

        Diffs

        -----

        src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e

        src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4

        src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081

        src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd

        src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0

        Diff: https://reviews.apache.org/r/4971/diff

        Testing

        -------

        "ant clean test" passes

        I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up.

        Thanks,

        Travis

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-05-04 22:39:58, Travis Crawford wrote: > src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java, line 116 > < https://reviews.apache.org/r/4971/diff/1/?file=105999#file105999line116 > > > I don't actually understand this whole properties loop, since its not used. Any clue why this is here? If not, mind if I just remove it entirely? Originally I left this alone since it was not strictly necessary for the patch. No idea, looks like dead code to me. I don't mind having it removed as part of this patch. Francis ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/#review7586 ----------------------------------------------------------- On 2012-05-05 01:23:50, Travis Crawford wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/ ----------------------------------------------------------- (Updated 2012-05-05 01:23:50) Review request for hcatalog and Francis Liu. Summary ------- Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters. This addresses bug HCATALOG-373 . https://issues.apache.org/jira/browse/HCATALOG-373 Diffs ----- src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4 src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081 src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0 Diff: https://reviews.apache.org/r/4971/diff Testing ------- "ant clean test" passes I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up. Thanks, Travis
        Hide
        Travis Crawford added a comment -

        Can a committer take a look at this patch? Feedback from the previous round has been incorporated, and I believe this is ready to go.

        Show
        Travis Crawford added a comment - Can a committer take a look at this patch? Feedback from the previous round has been incorporated, and I believe this is ready to go.
        Hide
        Francis Liu added a comment -

        Sorry, missed this. Will take a look at it.

        Show
        Francis Liu added a comment - Sorry, missed this. Will take a look at it.
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4971/#review7816
        -----------------------------------------------------------

        Looks good to me. Just a concern with the change in the test output directory. If you agree with any of my proposals, I can change that on commit. Will run e2e tests.

        src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java
        <https://reviews.apache.org/r/4971/#comment17147>

        The warehouse directory is set in the ant test target. If you're doing this for runs in your IDE then it'd be better to have this check if the sysproperty hive.metastore.warehouse.dir is set to support both cases.

        It'll probably better if you put the dir in /tmp similar to ant the config "/tmp/hcat_junit_warehouse". Or add ant into the name

        {user.home}

        /hive_build/...

        • Francis

        On 2012-05-05 01:23:50, Travis Crawford wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4971/

        -----------------------------------------------------------

        (Updated 2012-05-05 01:23:50)

        Review request for hcatalog and Francis Liu.

        Summary

        -------

        Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters.

        This addresses bug HCATALOG-373.

        https://issues.apache.org/jira/browse/HCATALOG-373

        Diffs

        -----

        src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e

        src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4

        src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081

        src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd

        src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0

        Diff: https://reviews.apache.org/r/4971/diff

        Testing

        -------

        "ant clean test" passes

        I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up.

        Thanks,

        Travis

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/#review7816 ----------------------------------------------------------- Looks good to me. Just a concern with the change in the test output directory. If you agree with any of my proposals, I can change that on commit. Will run e2e tests. src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java < https://reviews.apache.org/r/4971/#comment17147 > The warehouse directory is set in the ant test target. If you're doing this for runs in your IDE then it'd be better to have this check if the sysproperty hive.metastore.warehouse.dir is set to support both cases. It'll probably better if you put the dir in /tmp similar to ant the config "/tmp/hcat_junit_warehouse". Or add ant into the name {user.home} /hive_build/... Francis On 2012-05-05 01:23:50, Travis Crawford wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/ ----------------------------------------------------------- (Updated 2012-05-05 01:23:50) Review request for hcatalog and Francis Liu. Summary ------- Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters. This addresses bug HCATALOG-373 . https://issues.apache.org/jira/browse/HCATALOG-373 Diffs ----- src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4 src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081 src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0 Diff: https://reviews.apache.org/r/4971/diff Testing ------- "ant clean test" passes I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up. Thanks, Travis
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-05-11 23:42:33, Francis Liu wrote:

        > Looks good to me. Just a concern with the change in the test output directory. If you agree with any of my proposals, I can change that on commit. Will run e2e tests.

        Sure, either approach sounds fine. The main thing is being able to run in the IDE. Do you want to make the change at commit time or for me to make it?

        • Travis

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4971/#review7816
        -----------------------------------------------------------

        On 2012-05-05 01:23:50, Travis Crawford wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4971/

        -----------------------------------------------------------

        (Updated 2012-05-05 01:23:50)

        Review request for hcatalog and Francis Liu.

        Summary

        -------

        Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters.

        This addresses bug HCATALOG-373.

        https://issues.apache.org/jira/browse/HCATALOG-373

        Diffs

        -----

        src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e

        src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4

        src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081

        src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd

        src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0

        Diff: https://reviews.apache.org/r/4971/diff

        Testing

        -------

        "ant clean test" passes

        I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up.

        Thanks,

        Travis

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-05-11 23:42:33, Francis Liu wrote: > Looks good to me. Just a concern with the change in the test output directory. If you agree with any of my proposals, I can change that on commit. Will run e2e tests. Sure, either approach sounds fine. The main thing is being able to run in the IDE. Do you want to make the change at commit time or for me to make it? Travis ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/#review7816 ----------------------------------------------------------- On 2012-05-05 01:23:50, Travis Crawford wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/ ----------------------------------------------------------- (Updated 2012-05-05 01:23:50) Review request for hcatalog and Francis Liu. Summary ------- Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters. This addresses bug HCATALOG-373 . https://issues.apache.org/jira/browse/HCATALOG-373 Diffs ----- src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4 src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081 src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0 Diff: https://reviews.apache.org/r/4971/diff Testing ------- "ant clean test" passes I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up. Thanks, Travis
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-05-11 23:42:33, Francis Liu wrote:

        > Looks good to me. Just a concern with the change in the test output directory. If you agree with any of my proposals, I can change that on commit. Will run e2e tests.

        Travis Crawford wrote:

        Sure, either approach sounds fine. The main thing is being able to run in the IDE. Do you want to make the change at commit time or for me to make it?

        Cool. I don't mind either way.

        • Francis

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4971/#review7816
        -----------------------------------------------------------

        On 2012-05-05 01:23:50, Travis Crawford wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4971/

        -----------------------------------------------------------

        (Updated 2012-05-05 01:23:50)

        Review request for hcatalog and Francis Liu.

        Summary

        -------

        Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters.

        This addresses bug HCATALOG-373.

        https://issues.apache.org/jira/browse/HCATALOG-373

        Diffs

        -----

        src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e

        src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4

        src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081

        src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd

        src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0

        Diff: https://reviews.apache.org/r/4971/diff

        Testing

        -------

        "ant clean test" passes

        I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up.

        Thanks,

        Travis

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-05-11 23:42:33, Francis Liu wrote: > Looks good to me. Just a concern with the change in the test output directory. If you agree with any of my proposals, I can change that on commit. Will run e2e tests. Travis Crawford wrote: Sure, either approach sounds fine. The main thing is being able to run in the IDE. Do you want to make the change at commit time or for me to make it? Cool. I don't mind either way. Francis ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/#review7816 ----------------------------------------------------------- On 2012-05-05 01:23:50, Travis Crawford wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/ ----------------------------------------------------------- (Updated 2012-05-05 01:23:50) Review request for hcatalog and Francis Liu. Summary ------- Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters. This addresses bug HCATALOG-373 . https://issues.apache.org/jira/browse/HCATALOG-373 Diffs ----- src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4 src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081 src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0 Diff: https://reviews.apache.org/r/4971/diff Testing ------- "ant clean test" passes I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up. Thanks, Travis
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-05-11 23:42:33, Francis Liu wrote:

        > Looks good to me. Just a concern with the change in the test output directory. If you agree with any of my proposals, I can change that on commit. Will run e2e tests.

        Travis Crawford wrote:

        Sure, either approach sounds fine. The main thing is being able to run in the IDE. Do you want to make the change at commit time or for me to make it?

        Francis Liu wrote:

        Cool. I don't mind either way.

        Hey Francis,

        What do you think about checking this in as-is, and we can address the logs directory in https://issues.apache.org/jira/browse/HCATALOG-374 ? There's a lot of inconsistent setting up of a test metastore and I'd like to consolidate that setup code in a base test class so updates like these are available to all tests. I posted a patch in that issue you can look at as an example of what I'm proposing.

        • Travis

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4971/#review7816
        -----------------------------------------------------------

        On 2012-05-05 01:23:50, Travis Crawford wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4971/

        -----------------------------------------------------------

        (Updated 2012-05-05 01:23:50)

        Review request for hcatalog and Francis Liu.

        Summary

        -------

        Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters.

        This addresses bug HCATALOG-373.

        https://issues.apache.org/jira/browse/HCATALOG-373

        Diffs

        -----

        src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e

        src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4

        src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081

        src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd

        src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0

        Diff: https://reviews.apache.org/r/4971/diff

        Testing

        -------

        "ant clean test" passes

        I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up.

        Thanks,

        Travis

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-05-11 23:42:33, Francis Liu wrote: > Looks good to me. Just a concern with the change in the test output directory. If you agree with any of my proposals, I can change that on commit. Will run e2e tests. Travis Crawford wrote: Sure, either approach sounds fine. The main thing is being able to run in the IDE. Do you want to make the change at commit time or for me to make it? Francis Liu wrote: Cool. I don't mind either way. Hey Francis, What do you think about checking this in as-is, and we can address the logs directory in https://issues.apache.org/jira/browse/HCATALOG-374 ? There's a lot of inconsistent setting up of a test metastore and I'd like to consolidate that setup code in a base test class so updates like these are available to all tests. I posted a patch in that issue you can look at as an example of what I'm proposing. Travis ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/#review7816 ----------------------------------------------------------- On 2012-05-05 01:23:50, Travis Crawford wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/ ----------------------------------------------------------- (Updated 2012-05-05 01:23:50) Review request for hcatalog and Francis Liu. Summary ------- Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters. This addresses bug HCATALOG-373 . https://issues.apache.org/jira/browse/HCATALOG-373 Diffs ----- src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4 src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081 src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0 Diff: https://reviews.apache.org/r/4971/diff Testing ------- "ant clean test" passes I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up. Thanks, Travis
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-05-11 23:42:33, Francis Liu wrote:

        > Looks good to me. Just a concern with the change in the test output directory. If you agree with any of my proposals, I can change that on commit. Will run e2e tests.

        Travis Crawford wrote:

        Sure, either approach sounds fine. The main thing is being able to run in the IDE. Do you want to make the change at commit time or for me to make it?

        Francis Liu wrote:

        Cool. I don't mind either way.

        Travis Crawford wrote:

        Hey Francis,

        What do you think about checking this in as-is, and we can address the logs directory in https://issues.apache.org/jira/browse/HCATALOG-374 ? There's a lot of inconsistent setting up of a test metastore and I'd like to consolidate that setup code in a base test class so updates like these are available to all tests. I posted a patch in that issue you can look at as an example of what I'm proposing.

        Hi Travis, if you're addressing this issue in a separate Jira. I'd prefer not checking in this portion of the patch as it might affect the automated builds.

        • Francis

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4971/#review7816
        -----------------------------------------------------------

        On 2012-05-05 01:23:50, Travis Crawford wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4971/

        -----------------------------------------------------------

        (Updated 2012-05-05 01:23:50)

        Review request for hcatalog and Francis Liu.

        Summary

        -------

        Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters.

        This addresses bug HCATALOG-373.

        https://issues.apache.org/jira/browse/HCATALOG-373

        Diffs

        -----

        src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e

        src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4

        src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081

        src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd

        src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0

        Diff: https://reviews.apache.org/r/4971/diff

        Testing

        -------

        "ant clean test" passes

        I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up.

        Thanks,

        Travis

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-05-11 23:42:33, Francis Liu wrote: > Looks good to me. Just a concern with the change in the test output directory. If you agree with any of my proposals, I can change that on commit. Will run e2e tests. Travis Crawford wrote: Sure, either approach sounds fine. The main thing is being able to run in the IDE. Do you want to make the change at commit time or for me to make it? Francis Liu wrote: Cool. I don't mind either way. Travis Crawford wrote: Hey Francis, What do you think about checking this in as-is, and we can address the logs directory in https://issues.apache.org/jira/browse/HCATALOG-374 ? There's a lot of inconsistent setting up of a test metastore and I'd like to consolidate that setup code in a base test class so updates like these are available to all tests. I posted a patch in that issue you can look at as an example of what I'm proposing. Hi Travis, if you're addressing this issue in a separate Jira. I'd prefer not checking in this portion of the patch as it might affect the automated builds. Francis ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/#review7816 ----------------------------------------------------------- On 2012-05-05 01:23:50, Travis Crawford wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/ ----------------------------------------------------------- (Updated 2012-05-05 01:23:50) Review request for hcatalog and Francis Liu. Summary ------- Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters. This addresses bug HCATALOG-373 . https://issues.apache.org/jira/browse/HCATALOG-373 Diffs ----- src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4 src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081 src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0 Diff: https://reviews.apache.org/r/4971/diff Testing ------- "ant clean test" passes I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up. Thanks, Travis
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-05-11 23:42:33, Francis Liu wrote:

        > Looks good to me. Just a concern with the change in the test output directory. If you agree with any of my proposals, I can change that on commit. Will run e2e tests.

        Travis Crawford wrote:

        Sure, either approach sounds fine. The main thing is being able to run in the IDE. Do you want to make the change at commit time or for me to make it?

        Francis Liu wrote:

        Cool. I don't mind either way.

        Travis Crawford wrote:

        Hey Francis,

        What do you think about checking this in as-is, and we can address the logs directory in https://issues.apache.org/jira/browse/HCATALOG-374 ? There's a lot of inconsistent setting up of a test metastore and I'd like to consolidate that setup code in a base test class so updates like these are available to all tests. I posted a patch in that issue you can look at as an example of what I'm proposing.

        Francis Liu wrote:

        Hi Travis, if you're addressing this issue in a separate Jira. I'd prefer not checking in this portion of the patch as it might affect the automated builds.

        The tests pass locally so CI should not be affected. In general I'm not a fan of builds producing things outside the build directory, because it makes locating test data and logs more difficult, and the clean target needs to do more than simply deleting the build directory, so there are opportunities for clean to miss things. Putting files in directories with the test class name ensures state does not get shared between tests.

        Please let me know what you'd like me to do here so we can wrap this up. This patch has been going for over a month now and I'd really like to reach closure and move on to new issues.

        • Travis

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4971/#review7816
        -----------------------------------------------------------

        On 2012-05-05 01:23:50, Travis Crawford wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4971/

        -----------------------------------------------------------

        (Updated 2012-05-05 01:23:50)

        Review request for hcatalog and Francis Liu.

        Summary

        -------

        Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters.

        This addresses bug HCATALOG-373.

        https://issues.apache.org/jira/browse/HCATALOG-373

        Diffs

        -----

        src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e

        src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4

        src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081

        src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd

        src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0

        Diff: https://reviews.apache.org/r/4971/diff

        Testing

        -------

        "ant clean test" passes

        I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up.

        Thanks,

        Travis

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-05-11 23:42:33, Francis Liu wrote: > Looks good to me. Just a concern with the change in the test output directory. If you agree with any of my proposals, I can change that on commit. Will run e2e tests. Travis Crawford wrote: Sure, either approach sounds fine. The main thing is being able to run in the IDE. Do you want to make the change at commit time or for me to make it? Francis Liu wrote: Cool. I don't mind either way. Travis Crawford wrote: Hey Francis, What do you think about checking this in as-is, and we can address the logs directory in https://issues.apache.org/jira/browse/HCATALOG-374 ? There's a lot of inconsistent setting up of a test metastore and I'd like to consolidate that setup code in a base test class so updates like these are available to all tests. I posted a patch in that issue you can look at as an example of what I'm proposing. Francis Liu wrote: Hi Travis, if you're addressing this issue in a separate Jira. I'd prefer not checking in this portion of the patch as it might affect the automated builds. The tests pass locally so CI should not be affected. In general I'm not a fan of builds producing things outside the build directory, because it makes locating test data and logs more difficult, and the clean target needs to do more than simply deleting the build directory, so there are opportunities for clean to miss things. Putting files in directories with the test class name ensures state does not get shared between tests. Please let me know what you'd like me to do here so we can wrap this up. This patch has been going for over a month now and I'd really like to reach closure and move on to new issues. Travis ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/#review7816 ----------------------------------------------------------- On 2012-05-05 01:23:50, Travis Crawford wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/ ----------------------------------------------------------- (Updated 2012-05-05 01:23:50) Review request for hcatalog and Francis Liu. Summary ------- Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters. This addresses bug HCATALOG-373 . https://issues.apache.org/jira/browse/HCATALOG-373 Diffs ----- src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4 src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081 src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0 Diff: https://reviews.apache.org/r/4971/diff Testing ------- "ant clean test" passes I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up. Thanks, Travis
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-05-11 23:42:33, Francis Liu wrote:

        > Looks good to me. Just a concern with the change in the test output directory. If you agree with any of my proposals, I can change that on commit. Will run e2e tests.

        Travis Crawford wrote:

        Sure, either approach sounds fine. The main thing is being able to run in the IDE. Do you want to make the change at commit time or for me to make it?

        Francis Liu wrote:

        Cool. I don't mind either way.

        Travis Crawford wrote:

        Hey Francis,

        What do you think about checking this in as-is, and we can address the logs directory in https://issues.apache.org/jira/browse/HCATALOG-374 ? There's a lot of inconsistent setting up of a test metastore and I'd like to consolidate that setup code in a base test class so updates like these are available to all tests. I posted a patch in that issue you can look at as an example of what I'm proposing.

        Francis Liu wrote:

        Hi Travis, if you're addressing this issue in a separate Jira. I'd prefer not checking in this portion of the patch as it might affect the automated builds.

        Travis Crawford wrote:

        The tests pass locally so CI should not be affected. In general I'm not a fan of builds producing things outside the build directory, because it makes locating test data and logs more difficult, and the clean target needs to do more than simply deleting the build directory, so there are opportunities for clean to miss things. Putting files in directories with the test class name ensures state does not get shared between tests.

        Please let me know what you'd like me to do here so we can wrap this up. This patch has been going for over a month now and I'd really like to reach closure and move on to new issues.

        I see, I'm no fan either and since you're addressing it in a separate jira. I'd rather we make the changes there rather than making it point to potentially another external directory. I'll check this in without this section. Let me know if you're ok with that.

        • Francis

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4971/#review7816
        -----------------------------------------------------------

        On 2012-05-05 01:23:50, Travis Crawford wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4971/

        -----------------------------------------------------------

        (Updated 2012-05-05 01:23:50)

        Review request for hcatalog and Francis Liu.

        Summary

        -------

        Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters.

        This addresses bug HCATALOG-373.

        https://issues.apache.org/jira/browse/HCATALOG-373

        Diffs

        -----

        src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e

        src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4

        src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081

        src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd

        src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0

        Diff: https://reviews.apache.org/r/4971/diff

        Testing

        -------

        "ant clean test" passes

        I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up.

        Thanks,

        Travis

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-05-11 23:42:33, Francis Liu wrote: > Looks good to me. Just a concern with the change in the test output directory. If you agree with any of my proposals, I can change that on commit. Will run e2e tests. Travis Crawford wrote: Sure, either approach sounds fine. The main thing is being able to run in the IDE. Do you want to make the change at commit time or for me to make it? Francis Liu wrote: Cool. I don't mind either way. Travis Crawford wrote: Hey Francis, What do you think about checking this in as-is, and we can address the logs directory in https://issues.apache.org/jira/browse/HCATALOG-374 ? There's a lot of inconsistent setting up of a test metastore and I'd like to consolidate that setup code in a base test class so updates like these are available to all tests. I posted a patch in that issue you can look at as an example of what I'm proposing. Francis Liu wrote: Hi Travis, if you're addressing this issue in a separate Jira. I'd prefer not checking in this portion of the patch as it might affect the automated builds. Travis Crawford wrote: The tests pass locally so CI should not be affected. In general I'm not a fan of builds producing things outside the build directory, because it makes locating test data and logs more difficult, and the clean target needs to do more than simply deleting the build directory, so there are opportunities for clean to miss things. Putting files in directories with the test class name ensures state does not get shared between tests. Please let me know what you'd like me to do here so we can wrap this up. This patch has been going for over a month now and I'd really like to reach closure and move on to new issues. I see, I'm no fan either and since you're addressing it in a separate jira. I'd rather we make the changes there rather than making it point to potentially another external directory. I'll check this in without this section. Let me know if you're ok with that. Francis ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/#review7816 ----------------------------------------------------------- On 2012-05-05 01:23:50, Travis Crawford wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/ ----------------------------------------------------------- (Updated 2012-05-05 01:23:50) Review request for hcatalog and Francis Liu. Summary ------- Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters. This addresses bug HCATALOG-373 . https://issues.apache.org/jira/browse/HCATALOG-373 Diffs ----- src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4 src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081 src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0 Diff: https://reviews.apache.org/r/4971/diff Testing ------- "ant clean test" passes I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up. Thanks, Travis
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-05-11 23:42:33, Francis Liu wrote:

        > Looks good to me. Just a concern with the change in the test output directory. If you agree with any of my proposals, I can change that on commit. Will run e2e tests.

        Travis Crawford wrote:

        Sure, either approach sounds fine. The main thing is being able to run in the IDE. Do you want to make the change at commit time or for me to make it?

        Francis Liu wrote:

        Cool. I don't mind either way.

        Travis Crawford wrote:

        Hey Francis,

        What do you think about checking this in as-is, and we can address the logs directory in https://issues.apache.org/jira/browse/HCATALOG-374 ? There's a lot of inconsistent setting up of a test metastore and I'd like to consolidate that setup code in a base test class so updates like these are available to all tests. I posted a patch in that issue you can look at as an example of what I'm proposing.

        Francis Liu wrote:

        Hi Travis, if you're addressing this issue in a separate Jira. I'd prefer not checking in this portion of the patch as it might affect the automated builds.

        Travis Crawford wrote:

        The tests pass locally so CI should not be affected. In general I'm not a fan of builds producing things outside the build directory, because it makes locating test data and logs more difficult, and the clean target needs to do more than simply deleting the build directory, so there are opportunities for clean to miss things. Putting files in directories with the test class name ensures state does not get shared between tests.

        Please let me know what you'd like me to do here so we can wrap this up. This patch has been going for over a month now and I'd really like to reach closure and move on to new issues.

        Francis Liu wrote:

        I see, I'm no fan either and since you're addressing it in a separate jira. I'd rather we make the changes there rather than making it point to potentially another external directory. I'll check this in without this section. Let me know if you're ok with that.

        That would be great. Thanks!

        • Travis

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/4971/#review7816
        -----------------------------------------------------------

        On 2012-05-05 01:23:50, Travis Crawford wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/4971/

        -----------------------------------------------------------

        (Updated 2012-05-05 01:23:50)

        Review request for hcatalog and Francis Liu.

        Summary

        -------

        Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters.

        This addresses bug HCATALOG-373.

        https://issues.apache.org/jira/browse/HCATALOG-373

        Diffs

        -----

        src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e

        src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4

        src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081

        src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd

        src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0

        Diff: https://reviews.apache.org/r/4971/diff

        Testing

        -------

        "ant clean test" passes

        I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up.

        Thanks,

        Travis

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-05-11 23:42:33, Francis Liu wrote: > Looks good to me. Just a concern with the change in the test output directory. If you agree with any of my proposals, I can change that on commit. Will run e2e tests. Travis Crawford wrote: Sure, either approach sounds fine. The main thing is being able to run in the IDE. Do you want to make the change at commit time or for me to make it? Francis Liu wrote: Cool. I don't mind either way. Travis Crawford wrote: Hey Francis, What do you think about checking this in as-is, and we can address the logs directory in https://issues.apache.org/jira/browse/HCATALOG-374 ? There's a lot of inconsistent setting up of a test metastore and I'd like to consolidate that setup code in a base test class so updates like these are available to all tests. I posted a patch in that issue you can look at as an example of what I'm proposing. Francis Liu wrote: Hi Travis, if you're addressing this issue in a separate Jira. I'd prefer not checking in this portion of the patch as it might affect the automated builds. Travis Crawford wrote: The tests pass locally so CI should not be affected. In general I'm not a fan of builds producing things outside the build directory, because it makes locating test data and logs more difficult, and the clean target needs to do more than simply deleting the build directory, so there are opportunities for clean to miss things. Putting files in directories with the test class name ensures state does not get shared between tests. Please let me know what you'd like me to do here so we can wrap this up. This patch has been going for over a month now and I'd really like to reach closure and move on to new issues. Francis Liu wrote: I see, I'm no fan either and since you're addressing it in a separate jira. I'd rather we make the changes there rather than making it point to potentially another external directory. I'll check this in without this section. Let me know if you're ok with that. That would be great. Thanks! Travis ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/#review7816 ----------------------------------------------------------- On 2012-05-05 01:23:50, Travis Crawford wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4971/ ----------------------------------------------------------- (Updated 2012-05-05 01:23:50) Review request for hcatalog and Francis Liu. Summary ------- Update ProgressReporter to work with both old and new mapreduce API. Delay creating the base record reader so we have a StatusReporter and can use counters. This addresses bug HCATALOG-373 . https://issues.apache.org/jira/browse/HCATALOG-373 Diffs ----- src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4 src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081 src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0 Diff: https://reviews.apache.org/r/4971/diff Testing ------- "ant clean test" passes I can run pig+hcatalog queries using Elephant-Bird deprecated API wrappers, which is why this issue originally came up. Thanks, Travis
        Hide
        Francis Liu added a comment -

        checked into trunk and branch-0.4. Thanks Travis.

        Show
        Francis Liu added a comment - checked into trunk and branch-0.4. Thanks Travis.

          People

          • Assignee:
            Travis Crawford
            Reporter:
            Travis Crawford
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development