Sqoop Export Upsert for MySQL lacks batch support

    Details

      Description

      MySQL export upserts are limited to one row per statement. This wastes bandwidth and makes the exports unusably slow.

      I wrote a patch to support multiple rows per statement (https://github.com/cloudera/sqoop/pull/22/files). With it, the records.per.statement parameter actually takes effect, and I get fast exports (60k rows/sec) with 1000 records per statement, 10 statements per transaction, and 5 mappers.
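      To illustrate what "multiple rows per statement" means for a MySQL upsert, here is a minimal sketch that builds a single INSERT ... ON DUPLICATE KEY UPDATE statement covering several rows. It is not the code from the patch: the class name MultiRowUpsertSketch, the method buildUpsert, and the use of "?" placeholders are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Sqoop: shows the shape of a batched upsert.
class MultiRowUpsertSketch {

    /**
     * Builds one MySQL upsert statement with rowCount value groups,
     * so that rowCount rows travel in a single round trip.
     */
    static String buildUpsert(String table, List<String> columns, int rowCount) {
        if (columns.isEmpty() || rowCount < 1) {
            throw new IllegalArgumentException("need at least one column and one row");
        }

        // One "(?, ?, ...)" group per row, with one placeholder per column.
        String group = "(" + String.join(", ",
                Collections.nCopies(columns.size(), "?")) + ")";

        // On a key collision, overwrite each column with the incoming value.
        List<String> updates = new ArrayList<>();
        for (String column : columns) {
            updates.add(column + " = VALUES(" + column + ")");
        }

        return "INSERT INTO " + table
                + " (" + String.join(", ", columns) + ")"
                + " VALUES " + String.join(", ", Collections.nCopies(rowCount, group))
                + " ON DUPLICATE KEY UPDATE " + String.join(", ", updates);
    }

    public static void main(String[] args) {
        // Two rows in one statement instead of two separate statements.
        System.out.println(buildUpsert("t", Arrays.asList("id", "name"), 2));
    }
}
```

      With 1000 records per statement, the same idea yields one statement with 1000 value groups instead of 1000 single-row statements, which is where the bandwidth and latency savings would come from.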

        Activity

        gwenshap Gwen Shapira added a comment -

        Thanks for contributing this correction! Row-by-row updates are indeed super slow.

        Just one question:
        Your comment says: "This assumes that the update key column is the last column serialized in by the underlying record."
        Can you show where the code depends on this assumption? It looked very generic to me.

        skeltoac Andy Skelton added a comment -

        I confess I copied that whole snippet from UpdateOutputFormat.java without considering the comments.

        gwenshap Gwen Shapira added a comment -

        That's fine. I was worried about breaking existing upsert processes. Since you didn't add a new assumption, there's no risk of breakage.

        +1 for committing.

        jarcec Jarek Jarcec Cecho added a comment -

        Moving to the Patch Available state so that it shows up in our review queues.

        jarcec Jarek Jarcec Cecho added a comment -

        +1

        jira-bot ASF subversion and git services added a comment -

        Commit d03faf3544b1ef07c9496fa7641ed6a42f58cb1d in sqoop's branch refs/heads/trunk from Jarek Jarcec Cecho
        [ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=d03faf3 ]

        SQOOP-1341: Sqoop Export Upsert for MySQL lacks batch support

        (Andy Skelton via Jarek Jarcec Cecho)

        jarcec Jarek Jarcec Cecho added a comment -

        Thank you for your contribution Andy Skelton!

        hudson Hudson added a comment -

        SUCCESS: Integrated in Sqoop-ant-jdk-1.6-hadoop200 #894 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop200/894/)
        SQOOP-1341: Sqoop Export Upsert for MySQL lacks batch support (jarcec: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=d03faf3544b1ef07c9496fa7641ed6a42f58cb1d)

        • src/java/org/apache/sqoop/mapreduce/mysql/MySQLUpsertOutputFormat.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in Sqoop-ant-jdk-1.6-hadoop20 #888 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop20/888/)
        SQOOP-1341: Sqoop Export Upsert for MySQL lacks batch support (jarcec: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=d03faf3544b1ef07c9496fa7641ed6a42f58cb1d)

        • src/java/org/apache/sqoop/mapreduce/mysql/MySQLUpsertOutputFormat.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in Sqoop-ant-jdk-1.6-hadoop23 #1091 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop23/1091/)
        SQOOP-1341: Sqoop Export Upsert for MySQL lacks batch support (jarcec: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=d03faf3544b1ef07c9496fa7641ed6a42f58cb1d)

        • src/java/org/apache/sqoop/mapreduce/mysql/MySQLUpsertOutputFormat.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in Sqoop-ant-jdk-1.6-hadoop100 #853 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop100/853/)
        SQOOP-1341: Sqoop Export Upsert for MySQL lacks batch support (jarcec: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=d03faf3544b1ef07c9496fa7641ed6a42f58cb1d)

        • src/java/org/apache/sqoop/mapreduce/mysql/MySQLUpsertOutputFormat.java

          People

          • Assignee: Andy Skelton (skeltoac)
          • Reporter: Andy Skelton (skeltoac)
          • Votes: 0
          • Watchers: 5

              Time Tracking

              • Original Estimate: 1h
              • Remaining Estimate: 1h
              • Time Spent: Not Specified