Sqoop / SQOOP-314

Basic export hangs when target database does not support INSERT syntax with multiple rows of values

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0-incubating
    • Component/s: None
    • Labels: None

      Description

      A basic export job hangs when the target database does not support INSERT syntax with multiple rows of values, such as: INSERT INTO tbl (col1, col2) VALUES (11, 12), (21, 22), (23, 24)

      This is because, in close(), AsyncSqlRecordWriter will still wait for AsyncSqlExecThread to finish even when an SQLException is thrown underneath.
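
The waiting behaviour described above can be illustrated with a minimal sketch. It is not Sqoop's actual code: the class AsyncWriterSketch, the use of an empty Optional as a stop signal, and the pretend execute() method that rejects multi-row VALUES lists are all assumptions made for illustration. It shows one way a writer could surface a background failure from close(): the worker records its exception and exits, and close() waits with a bound and then rethrows the recorded error as an IOException.

```java
import java.io.IOException;
import java.sql.SQLException;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical writer that hands SQL text to a background thread. */
final class AsyncWriterSketch {
  // An empty Optional is the stop signal; a present value is SQL to execute.
  private final BlockingQueue<Optional<String>> queue = new LinkedBlockingQueue<>();
  private volatile SQLException workerError;  // set by the worker on failure
  private final Thread worker;

  AsyncWriterSketch() {
    worker = new Thread(this::drain, "exec-thread-sketch");
    worker.setDaemon(true);
    worker.start();
  }

  /** Worker loop: stops on the stop signal or on the first SQL failure. */
  private void drain() {
    try {
      while (true) {
        Optional<String> sql = queue.take();
        if (!sql.isPresent()) {
          return;
        }
        execute(sql.get());
      }
    } catch (SQLException e) {
      workerError = e;  // remember the failure so close() can report it
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  /** Pretend database that rejects multi-row VALUES lists (assumption). */
  private static void execute(String sql) throws SQLException {
    if (sql.contains("),(")) {
      throw new SQLException("multi-row INSERT not supported: " + sql);
    }
  }

  void write(String sql) {
    queue.add(Optional.of(sql));
  }

  /** Waits for the worker with a bound, then reports any recorded failure. */
  void close() throws IOException {
    queue.add(Optional.empty());
    try {
      worker.join(5000L);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IOException("Interrupted while waiting for the worker", e);
    }
    if (workerError != null) {
      throw new IOException("Export failed in the background thread", workerError);
    }
    if (worker.isAlive()) {
      throw new IOException("Worker did not finish within the timeout");
    }
  }

  public static void main(String[] args) {
    AsyncWriterSketch writer = new AsyncWriterSketch();
    writer.write("INSERT INTO tbl (col1, col2) VALUES(11, 12),(21, 22),(23, 24)");
    try {
      writer.close();
      System.out.println("closed cleanly");
    } catch (IOException e) {
      System.out.println("close() reported: " + e.getCause().getMessage());
    }
  }
}
```

In this sketch the join is bounded and the error check follows it, so a failed statement ends the job with an exception rather than an open-ended wait. The five-second timeout is an arbitrary illustrative value.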

      The configuration variable "sqoop.export.records.per.statement" can be set to 1 as a workaround for this problem.
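
To see why the workaround helps, the sketch below shows how the number of records per statement changes the shape of the generated INSERT. It is an illustration only: InsertStatementSketch and buildInsert are hypothetical names, and Sqoop's real statement generation may differ in detail. With one record per statement, the multi-row VALUES list disappears.

```java
import java.util.Collections;

/** Hypothetical helper that builds a parameterized INSERT statement. */
final class InsertStatementSketch {

  /**
   * Builds an INSERT with one "(?, ?, ...)" group per row.
   * rowsPerStatement plays the role of sqoop.export.records.per.statement.
   */
  static String buildInsert(String table, String[] columns, int rowsPerStatement) {
    if (rowsPerStatement < 1) {
      throw new IllegalArgumentException("rowsPerStatement must be at least 1");
    }
    // One placeholder group, e.g. "(?, ?)" for two columns.
    String group =
        "(" + String.join(", ", Collections.nCopies(columns.length, "?")) + ")";
    // Repeat the group once per row, e.g. "(?, ?), (?, ?), (?, ?)".
    String values = String.join(", ", Collections.nCopies(rowsPerStatement, group));
    return "INSERT INTO " + table + " (" + String.join(", ", columns) + ") VALUES " + values;
  }

  public static void main(String[] args) {
    String[] columns = {"col1", "col2"};
    System.out.println(buildInsert("tbl", columns, 3));
    System.out.println(buildInsert("tbl", columns, 1));
  }
}
```

The first call produces a three-row VALUES list, which is the form some databases reject. The second produces a plain single-row INSERT. The property is normally supplied as a Hadoop generic argument, for example -Dsqoop.export.records.per.statement=1 on the sqoop export command line.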

      1. SQOOP-314.diff
        14 kB
        Bilung Lee

        Activity

        Hudson added a comment -

        Integrated in Sqoop-jdk-1.6 #14 (See https://builds.apache.org/job/Sqoop-jdk-1.6/14/)
        SQOOP-314. Support for batch insert.

        (Bilung Lee via Arvind Prabhakar)

        arvind : http://svn.apache.org/viewvc/?view=rev&rev=1159773
        Files :

        • /incubator/sqoop/trunk/src/java/com/cloudera/sqoop/manager/SQLServerManager.java
        • /incubator/sqoop/trunk/src/java/com/cloudera/sqoop/mapreduce/ExportBatchOutputFormat.java
        • /incubator/sqoop/trunk/src/java/com/cloudera/sqoop/SqoopOptions.java
        • /incubator/sqoop/trunk/src/java/com/cloudera/sqoop/manager/OracleManager.java
        • /incubator/sqoop/trunk/src/java/com/cloudera/sqoop/mapreduce/ExportJobBase.java
        • /incubator/sqoop/trunk/src/java/com/cloudera/sqoop/mapreduce/ExportOutputFormat.java
        • /incubator/sqoop/trunk/src/docs/user/export.txt
        • /incubator/sqoop/trunk/src/test/com/cloudera/sqoop/manager/JdbcMySQLExportTest.java
        • /incubator/sqoop/trunk/src/docs/man/sqoop-export.txt
        • /incubator/sqoop/trunk/src/java/com/cloudera/sqoop/mapreduce/AsyncSqlRecordWriter.java
        • /incubator/sqoop/trunk/src/java/com/cloudera/sqoop/tool/BaseSqoopTool.java
        • /incubator/sqoop/trunk/src/java/com/cloudera/sqoop/tool/ExportTool.java
        Arvind Prabhakar added a comment -

        Patch committed. Thanks Bilung!

        jiraposter@reviews.apache.org added a comment -

        On 2011-08-19 18:02:13, Arvind Prabhakar wrote:
        > +1
        >
        > Can you please rebase the patch and attach it to the Jira? Thanks!

        Yes, just did.

        • Bilung

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/1585/#review1553
        -----------------------------------------------------------

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/1585/#review1553
        -----------------------------------------------------------

        Ship it!

        +1

        Can you please rebase the patch and attach it to the Jira? Thanks!

        • Arvind

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/1585/
        -----------------------------------------------------------

        (Updated 2011-08-19 00:49:12.862727)

        Review request for Sqoop, Arvind Prabhakar and jmhsieh.

        Changes
        -------

        New patch that incorporates comments.

        Summary
        -------

        Basic export job will hang when the target database does not support insert syntax with multiple rows of values, such as INSERT INTO tbl (col1, col2) VALUES(11, 12),(21, 22),(23, 24)

        This is because, in close(), AsyncSqlRecordWriter will still wait for AsyncSqlExecThread to finish even when an SQLException is thrown underneath.

        The fix is to introduce a new "--batch" option to execute underlying statements in batch mode instead of a single multirow insert statement.
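
Batch mode in JDBC terms can be sketched as follows. This is a minimal illustration under assumptions, not the code from the patch: BatchInsertSketch and insertAll are hypothetical names, and only the standard PreparedStatement methods setObject, addBatch and executeBatch are used. Each row is bound to a single-row INSERT and queued, and the driver sends the queued rows when executeBatch is called, so no multi-row VALUES clause is required.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

/** Hypothetical helper that inserts rows through JDBC batching. */
final class BatchInsertSketch {

  /**
   * Binds each row to the single-row statement and queues it with addBatch(),
   * then submits all queued rows with one executeBatch() call.
   * Returns the per-row update counts reported by the driver.
   */
  static int[] insertAll(PreparedStatement stmt, List<Object[]> rows) throws SQLException {
    for (Object[] row : rows) {
      for (int i = 0; i < row.length; i++) {
        stmt.setObject(i + 1, row[i]);  // JDBC parameters are 1-based
      }
      stmt.addBatch();
    }
    return stmt.executeBatch();
  }
}
```

The statement passed in would be prepared from a single-row form such as INSERT INTO tbl (col1, col2) VALUES (?, ?), with one Object[] per record to export.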

        This addresses bug SQOOP-314.
        https://issues.apache.org/jira/browse/SQOOP-314

        Diffs (updated)
        ---------------

        src/java/com/cloudera/sqoop/mapreduce/ExportOutputFormat.java d2a6cf6
        src/java/com/cloudera/sqoop/tool/BaseSqoopTool.java 8f629f1
        src/java/com/cloudera/sqoop/tool/ExportTool.java b4b1091
        src/test/com/cloudera/sqoop/manager/JdbcMySQLExportTest.java 8687b0c
        src/docs/man/sqoop-export.txt 6090ea1
        src/docs/user/export.txt 4f87886
        src/java/com/cloudera/sqoop/SqoopOptions.java d760d39
        src/java/com/cloudera/sqoop/manager/OracleManager.java 6a55312
        src/java/com/cloudera/sqoop/manager/SQLServerManager.java e1ce2af
        src/java/com/cloudera/sqoop/mapreduce/AsyncSqlRecordWriter.java 193cf41
        src/java/com/cloudera/sqoop/mapreduce/ExportBatchOutputFormat.java PRE-CREATION
        src/java/com/cloudera/sqoop/mapreduce/ExportJobBase.java 9799e37

        Diff: https://reviews.apache.org/r/1585/diff

        Testing
        -------

        Thanks,

        Bilung

        jiraposter@reviews.apache.org added a comment -

        On 2011-08-18 22:35:36, Arvind Prabhakar wrote:
        > Changes look good, Bilung. One high-level suggestion is to modify OracleManager and SQLServerManager to use the new ExportBatchOutputFormat. The OracleExportOutputFormat and SQLServerExportOutputFormat classes can then be deprecated.

        Done in the new patch. Thanks.

        On 2011-08-18 22:35:36, Arvind Prabhakar wrote:
        > src/java/com/cloudera/sqoop/SqoopOptions.java, line 1026
        > <https://reviews.apache.org/r/1585/diff/1/?file=33455#file33455line1026>
        >
        > This will trigger a checkstyle warning. The argument name should be changed to mode or something else.

        Done in the new patch. Thanks.

        On 2011-08-18 22:35:36, Arvind Prabhakar wrote:
        > src/java/com/cloudera/sqoop/mapreduce/ExportBatchOutputFormat.java, line 41
        > <https://reviews.apache.org/r/1585/diff/1/?file=33457#file33457line41>
        >
        > Line is longer than 80 characters.

        Done in the new patch. Thanks.

        On 2011-08-18 22:35:36, Arvind Prabhakar wrote:
        > src/java/com/cloudera/sqoop/mapreduce/AsyncSqlRecordWriter.java, line 179
        > <https://reviews.apache.org/r/1585/diff/1/?file=33456#file33456line179>
        >
        > Thread.stop() is an unsafe operation that can lead to corruption. Would it help if the join call were part of the try block instead of the finally block? In that case, throwing the IOException would automatically get rid of the thread.

        Done in the new patch. Thanks.

        • Bilung

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/1585/#review1531
        -----------------------------------------------------------

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/1585/#review1531
        -----------------------------------------------------------

        Changes look good, Bilung. One high-level suggestion is to modify OracleManager and SQLServerManager to use the new ExportBatchOutputFormat. The OracleExportOutputFormat and SQLServerExportOutputFormat classes can then be deprecated.

        src/java/com/cloudera/sqoop/SqoopOptions.java
        <https://reviews.apache.org/r/1585/#comment3483>

        This will trigger a checkstyle warning. Should change the argument name to mode or something else.

        src/java/com/cloudera/sqoop/mapreduce/AsyncSqlRecordWriter.java
        <https://reviews.apache.org/r/1585/#comment3486>

        Thread.stop() is an unsafe operation that can lead to corruption. Would it help if the join call were part of the try block instead of the finally block? In that case, throwing the IOException would automatically get rid of the thread.

        src/java/com/cloudera/sqoop/mapreduce/ExportBatchOutputFormat.java
        <https://reviews.apache.org/r/1585/#comment3484>

        Longer than 80.

        • Arvind

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/1585/
        -----------------------------------------------------------

        (Updated 2011-08-18 17:46:40.735309)

        Review request for Sqoop, Arvind Prabhakar and jmhsieh.

        Summary
        -------

        Basic export job will hang when the target database does not support insert syntax with multiple rows of values, such as INSERT INTO tbl (col1, col2) VALUES(11, 12),(21, 22),(23, 24)

        This is because, in close(), AsyncSqlRecordWriter will still wait for AsyncSqlExecThread to finish even when an SQLException is thrown underneath.

        The fix is to introduce a new "--batch" option to execute underlying statements in batch mode instead of a single multirow insert statement.

        This addresses bug SQOOP-314.
        https://issues.apache.org/jira/browse/SQOOP-314

        Diffs
        -----

        src/docs/man/sqoop-export.txt 6090ea1
        src/docs/user/export.txt 4f87886
        src/java/com/cloudera/sqoop/SqoopOptions.java d760d39
        src/java/com/cloudera/sqoop/mapreduce/AsyncSqlRecordWriter.java 193cf41
        src/java/com/cloudera/sqoop/mapreduce/ExportBatchOutputFormat.java PRE-CREATION
        src/java/com/cloudera/sqoop/mapreduce/ExportJobBase.java 9799e37
        src/java/com/cloudera/sqoop/mapreduce/ExportOutputFormat.java d2a6cf6
        src/java/com/cloudera/sqoop/tool/BaseSqoopTool.java 8f629f1
        src/java/com/cloudera/sqoop/tool/ExportTool.java b4b1091
        src/test/com/cloudera/sqoop/manager/JdbcMySQLExportTest.java 8687b0c

        Diff: https://reviews.apache.org/r/1585/diff

        Testing
        -------

        Thanks,

        Bilung


          People

          • Assignee: Bilung Lee
          • Reporter: Bilung Lee
          • Votes: 0
          • Watchers: 1
