HCatalog
HCATALOG-160

HBaseDirectOutputStorageDriver outputVersion isn't consistent within the same MR job

    Details

      Description

      A single MR job should use the same revision number throughout. The bug causes a new revision number to be created for each mapper.
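
      The failure mode described above can be sketched roughly as follows. This is a minimal simulation, not HCatalog code: the class `RevisionDriftSketch`, the key `sketch.output.job.info.revision`, the counter standing in for a revision manager, and the `Map` standing in for the job configuration are all assumptions made for illustration. It shows that when the revision chosen at job submission is not written back into the configuration, every mapper allocates its own.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in classes; none of this is HCatalog's actual API.
class RevisionDriftSketch {
    // Made-up key under which the revision travels in the (pretend) job configuration.
    static final String JOB_INFO_KEY = "sketch.output.job.info.revision";

    // Pretend revision manager: every call hands out a fresh revision number.
    static final AtomicLong NEXT_REVISION = new AtomicLong(1);

    // Models driver initialization. If the configuration carries no revision,
    // a new one is allocated; it is stored in conf only when writeBack is true.
    static long initialize(Map<String, String> conf, boolean writeBack) {
        String stored = conf.get(JOB_INFO_KEY);
        if (stored != null) {
            return Long.parseLong(stored);
        }
        long revision = NEXT_REVISION.getAndIncrement();
        if (writeBack) {
            conf.put(JOB_INFO_KEY, Long.toString(revision));
        }
        return revision;
    }

    // Simulates one job: initialize once on the submission side, then once per
    // mapper on a private copy of the configuration, as each task would see it.
    static Set<Long> revisionsSeenByMappers(int mappers, boolean writeBack) {
        Map<String, String> jobConf = new HashMap<>();
        initialize(jobConf, writeBack);
        Set<Long> seen = new LinkedHashSet<>();
        for (int i = 0; i < mappers; i++) {
            Map<String, String> taskConf = new HashMap<>(jobConf);
            seen.add(initialize(taskConf, writeBack));
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("distinct revisions without write-back: "
                + revisionsSeenByMappers(3, false).size());
        System.out.println("distinct revisions with write-back: "
                + revisionsSeenByMappers(3, true).size());
    }
}
```

      With write-back disabled, each simulated mapper sees a different revision; with it enabled, all mappers read the one stored at submission time.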

      1. HCATALOG-160.patch
        50 kB
        Francis Liu
      2. HCATALOG-160_svn.patch
        50 kB
        Francis Liu
      3. HCATALOG-160_rev2.patch
        23 kB
        Francis Liu

        Activity

        Alan Gates added a comment -

        Issue closed with 0.4 release.

        Ashutosh Chauhan added a comment -

        Patch committed to trunk. Thanks, Francis!

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2809/
        -----------------------------------------------------------

        (Updated 2011-11-23 07:03:08.841754)

        Review request for hcatalog, Sushanth Sowmyan, Vandana Ayyalasomayajula, and David Capwell.

        Changes
        -------

        Updated the patch to incorporate changes in trunk.

        Summary
        -------

        HBaseDirectOutputStorageDriver did not serialize the updated OutputJobInfo; this patch fixes that.

        This addresses bug hcatalog-160.
        https://issues.apache.org/jira/browse/hcatalog-160

        Diffs (updated)
        -----

        storage-drivers/hbase/src/java/org/apache/hcatalog/hbase/HBaseBaseOutputStorageDriver.java 596f942
        storage-drivers/hbase/src/java/org/apache/hcatalog/hbase/HBaseDirectOutputStorageDriver.java 65dfccb
        storage-drivers/hbase/src/test/org/apache/hcatalog/hbase/TestHBaseDirectOutputStorageDriver.java 1c5c60d

        Diff: https://reviews.apache.org/r/2809/diff

        Testing
        -------

        Updated the unit tests to cover this scenario; they pass now.

        Thanks,

        Francis

        jiraposter@reviews.apache.org added a comment -

        On 2011-11-23 00:23:02, Francis Liu wrote:
        > storage-drivers/hbase/src/java/org/apache/hcatalog/hbase/HBaseDirectOutputStorageDriver.java, line 41
        > <https://reviews.apache.org/r/2809/diff/1/?file=57620#file57620line41>
        >
        > I left it to the subclass to do the serialization, since the subclass may need to add more information to outputJobInfo before it gets serialized. One example is HBaseBulkOutputStorageDriver, where the intermediate location has to be stored in OutputJobInfo.

        I see. In that case, it would be better to add a comment in HBaseBaseOutputStorageDriver::initialize() stating that anyone extending this class must override initialize() and then overwrite the outputJobInfo in it. Also, the patch doesn't apply cleanly; can you refresh it?

        • Ashutosh

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2809/#review3450
        -----------------------------------------------------------

        On 2011-11-11 18:24:59, Francis Liu wrote:

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2809/
        -----------------------------------------------------------

        (Updated 2011-11-11 18:24:59)

        Review request for hcatalog, Sushanth Sowmyan, Vandana Ayyalasomayajula, and David Capwell.

        Summary
        -------

        HBaseDirectOutputStorageDriver did not serialize the updated OutputJobInfo; this patch fixes that.

        This addresses bug hcatalog-160.
        https://issues.apache.org/jira/browse/hcatalog-160

        Diffs
        -----

        storage-drivers/hbase/src/java/org/apache/hcatalog/hbase/HBaseDirectOutputStorageDriver.java 65dfccb
        storage-drivers/hbase/src/test/org/apache/hcatalog/hbase/TestHBaseBulkOutputStorageDriver.java c25e56d
        storage-drivers/hbase/src/test/org/apache/hcatalog/hbase/TestHBaseDirectOutputStorageDriver.java d612584

        Diff: https://reviews.apache.org/r/2809/diff

        Testing
        -------

        Updated the unit tests to cover this scenario; they pass now.

        Thanks,

        Francis

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2809/#review3450
        -----------------------------------------------------------

        storage-drivers/hbase/src/java/org/apache/hcatalog/hbase/HBaseDirectOutputStorageDriver.java
        <https://reviews.apache.org/r/2809/#comment7706>

        I left it to the subclass to do the serialization, since the subclass may need to add more information to outputJobInfo before it gets serialized. One example is HBaseBulkOutputStorageDriver, where the intermediate location has to be stored in OutputJobInfo.

        • Francis
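
        The arrangement described in this reply, where the base class updates the job info and each subclass serializes it after adding its own fields, might look roughly like the sketch below. All class names, the `sketch.output.job.info` key, the placeholder revision value, the made-up staging path, and the toy `k=v;k=v` serialization are assumptions for illustration; they are not the actual HCatalog classes or signatures.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-ins for illustration; not HCatalog's actual classes.
class SubclassSerializationSketch {
    public static void main(String[] args) {
        Map<String, String> directConf = new HashMap<>();
        new DirectDriverSketch().initialize(directConf);
        System.out.println(directConf.get(BaseDriverSketch.JOB_INFO_KEY));

        Map<String, String> bulkConf = new HashMap<>();
        new BulkDriverSketch().initialize(bulkConf);
        System.out.println(bulkConf.get(BaseDriverSketch.JOB_INFO_KEY));
    }
}

class JobInfoSketch {
    final Map<String, String> properties = new TreeMap<>();

    // Toy serialization ("k=v;k=v"), used only to make the write-back visible.
    String serialize() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : properties.entrySet()) {
            if (sb.length() > 0) {
                sb.append(';');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }
}

abstract class BaseDriverSketch {
    static final String JOB_INFO_KEY = "sketch.output.job.info"; // made-up key
    protected JobInfoSketch jobInfo;

    // Updates the in-memory job info only. Subclasses must override this method,
    // call super.initialize(conf), add their own fields, and then write the job
    // info back into conf; otherwise tasks never see the revision chosen here.
    void initialize(Map<String, String> conf) {
        jobInfo = new JobInfoSketch();
        jobInfo.properties.put("revision", "42"); // placeholder revision value
    }

    protected void writeBack(Map<String, String> conf) {
        conf.put(JOB_INFO_KEY, jobInfo.serialize());
    }
}

class DirectDriverSketch extends BaseDriverSketch {
    @Override
    void initialize(Map<String, String> conf) {
        super.initialize(conf);
        writeBack(conf); // nothing extra to add, but the write-back is still required
    }
}

class BulkDriverSketch extends BaseDriverSketch {
    @Override
    void initialize(Map<String, String> conf) {
        super.initialize(conf);
        jobInfo.properties.put("intermediateLocation", "/tmp/sketch-staging"); // made-up path
        writeBack(conf); // serialized once, after all fields are set
    }
}
```

        The cost of this design is that the write-back is a convention rather than something the base class enforces, which is why a comment on the base method matters.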

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2809/#review3449
        -----------------------------------------------------------

        storage-drivers/hbase/src/java/org/apache/hcatalog/hbase/HBaseDirectOutputStorageDriver.java
        <https://reviews.apache.org/r/2809/#comment7705>

        Shouldn't this be done within super.initialize()? Once the outputJobInfo is updated with the revision string in HBaseBaseOutputStorageDriver, it should be written back to the configuration right there.

        • Ashutosh
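
        As a rough illustration of the trade-off behind this question, the sketch below serializes the job info inside the base class's initialize(). The classes, the `sketch.output.job.info` key, the placeholder revision, and the staging path are hypothetical and only model the idea; this is not HCatalog code. Under these assumptions, anything a subclass adds after calling super.initialize() is missing from the serialized copy unless the subclass writes it back again.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-ins; not HCatalog's actual classes.
class EarlySerializationSketch {
    static final String JOB_INFO_KEY = "sketch.output.job.info"; // made-up key

    static class BaseDriver {
        protected final Map<String, String> jobInfo = new TreeMap<>();

        // Variant under discussion: the base class writes the job info back itself.
        void initialize(Map<String, String> conf) {
            jobInfo.put("revision", "42"); // placeholder revision value
            conf.put(JOB_INFO_KEY, jobInfo.toString());
        }
    }

    static class BulkDriver extends BaseDriver {
        @Override
        void initialize(Map<String, String> conf) {
            super.initialize(conf); // the serialized snapshot is taken here
            // Added after the snapshot, so it never reaches the configuration.
            jobInfo.put("intermediateLocation", "/tmp/sketch-staging");
        }
    }

    // Returns what a task would read back from the configuration.
    static String serializedAfterInit() {
        Map<String, String> conf = new HashMap<>();
        new BulkDriver().initialize(conf);
        return conf.get(JOB_INFO_KEY);
    }

    public static void main(String[] args) {
        System.out.println(serializedAfterInit());
    }
}
```

        The serialized value contains the revision but not the intermediate location, which is the situation a subclass such as the bulk driver would have to correct by serializing again.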

        Francis Liu added a comment -

        Attached an svn-friendly patch.

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/2809/
        -----------------------------------------------------------

        Review request for hcatalog, Sushanth Sowmyan, Vandana Ayyalasomayajula, and David Capwell.

        Summary
        -------

        HBaseDirectOutputStorageDriver did not serialize the updated OutputJobInfo; this patch fixes that.

        This addresses bug hcatalog-160.
        https://issues.apache.org/jira/browse/hcatalog-160

        Diffs
        -----

        storage-drivers/hbase/src/java/org/apache/hcatalog/hbase/HBaseDirectOutputStorageDriver.java 65dfccb
        storage-drivers/hbase/src/test/org/apache/hcatalog/hbase/TestHBaseBulkOutputStorageDriver.java c25e56d
        storage-drivers/hbase/src/test/org/apache/hcatalog/hbase/TestHBaseDirectOutputStorageDriver.java d612584

        Diff: https://reviews.apache.org/r/2809/diff

        Testing
        -------

        Updated the unit tests to cover this scenario; they pass now.

        Thanks,

        Francis


          People

          • Assignee: Francis Liu
          • Reporter: Francis Liu
          • Votes: 0
          • Watchers: 1
