Sqoop
  1. Sqoop
  2. SQOOP-604

Easy throttling feature for MySQL exports

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4.2
    • Fix Version/s: 1.4.3
    • Component/s: connectors/mysql
    • Labels:
      None

      Description

      Sqoop always tries to achieve the best possible throughput with exports, which might not be desirable in all cases. Sometimes we need to export large data with Sqoop to a live relational database (MySQL in our case), that is, a database that is under a high load serving random queries from the users of our product.

      While data consistency issues during the export can be easily solved with a staging table, there is still a problem: the performance impact caused by the heavy export.

      First off, the resources of MySQL dedicated to the import process can affect the performance of the live product, both on the master and on the slaves. Second, even if the servers can handle the import with no significant performance impact (mysqlimport should be relatively "cheap"), importing big tables (GB+) can cause serious replication lag in the cluster risking data consistency.

      My suggestion is quite simple. Using the already existing "checkpoint" feature of the MySQL exports (the export process is restarted every X bytes written), extending it with a new config value that would simply make the thread sleep for X milliseconds at the checkbpoints. With low enough byte count limit this can be a simple yet powerful throttling mechanism.

      1. SQOOP-604_v6.patch
        2 kB
        Abhijeet Gaikwad
      2. SQOOP-604_v6.patch
        2 kB
        Zoltan Toth-Czifra

        Activity

        Hide
        Zoltan Toth-Czifra added a comment - - edited
        diff --git a/src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java b/src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java
        old mode 100644
        new mode 100755
        index a4e8b88..648c1e2
        --- a/src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java
        +++ b/src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java
        @@ -64,6 +64,18 @@ public class MySQLExportMapper<KEYIN, VALIN>
           // Configured value for MSYQL_CHECKPOINT_BYTES_KEY.
           protected long checkpointDistInBytes;
        
        +  /** Configuration key that specifies the number of milliseconds
        +   * to sleep at the end of each checkpoint commit
        +   * Default is 0, no sleep.
        +   */
        +  public static final String MYSQL_CHECKPOINT_SLEEP_KEY =
        +      "sqoop.mysql.export.sleep.ms";
        +
        +  public static final long DEFAULT_CHECKPOINT_SLEEP_MS = 0;
        +
        +  // Configured value for MYSQL_CHECKPOINT_SLEEP_KEY.
        +  protected long checkpointSleepMs;
        +
           protected Configuration conf;
        
           /** The FIFO being used to communicate with mysqlimport. */
        @@ -314,6 +326,13 @@ public class MySQLExportMapper<KEYIN, VALIN>
               LOG.warn("Invalid value for " + MYSQL_CHECKPOINT_BYTES_KEY);
               this.checkpointDistInBytes = DEFAULT_CHECKPOINT_BYTES;
             }
        +
        +    this.checkpointSleepMs = conf.getLong(
        +        MYSQL_CHECKPOINT_SLEEP_KEY, DEFAULT_CHECKPOINT_SLEEP_MS);
        +    if (this.checkpointDistInBytes < 0) {
        +      LOG.warn("Invalid value for " + MYSQL_CHECKPOINT_SLEEP_KEY);
        +      this.checkpointDistInBytes = DEFAULT_CHECKPOINT_SLEEP_MS;
        +    }
           }
        
           /**
        @@ -347,6 +366,12 @@ public class MySQLExportMapper<KEYIN, VALIN>
             if (this.checkpointDistInBytes != 0
                 && this.bytesWritten > this.checkpointDistInBytes) {
               LOG.info("Checkpointing current export.");
        +
        +      if(this.checkpointSleepMs != 0) {
        +        LOG.info("Pausing.");
        +        Thread.sleep(this.checkpointSleepMs);
        +      }
        +
               closeExportHandles();
               initMySQLImportProcess();
               this.bytesWritten = 0;
        
        Show
        Zoltan Toth-Czifra added a comment - - edited diff --git a/src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java b/src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java old mode 100644 new mode 100755 index a4e8b88..648c1e2 --- a/src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java +++ b/src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java @@ -64,6 +64,18 @@ public class MySQLExportMapper<KEYIN, VALIN> // Configured value for MSYQL_CHECKPOINT_BYTES_KEY. protected long checkpointDistInBytes; + /** Configuration key that specifies the number of milliseconds + * to sleep at the end of each checkpoint commit + * Default is 0, no sleep. + */ + public static final String MYSQL_CHECKPOINT_SLEEP_KEY = + "sqoop.mysql.export.sleep.ms" ; + + public static final long DEFAULT_CHECKPOINT_SLEEP_MS = 0; + + // Configured value for MYSQL_CHECKPOINT_SLEEP_KEY. + protected long checkpointSleepMs; + protected Configuration conf; /** The FIFO being used to communicate with mysqlimport. */ @@ -314,6 +326,13 @@ public class MySQLExportMapper<KEYIN, VALIN> LOG.warn( "Invalid value for " + MYSQL_CHECKPOINT_BYTES_KEY); this .checkpointDistInBytes = DEFAULT_CHECKPOINT_BYTES; } + + this .checkpointSleepMs = conf.getLong( + MYSQL_CHECKPOINT_SLEEP_KEY, DEFAULT_CHECKPOINT_SLEEP_MS); + if ( this .checkpointDistInBytes < 0) { + LOG.warn( "Invalid value for " + MYSQL_CHECKPOINT_SLEEP_KEY); + this .checkpointDistInBytes = DEFAULT_CHECKPOINT_SLEEP_MS; + } } /** @@ -347,6 +366,12 @@ public class MySQLExportMapper<KEYIN, VALIN> if ( this .checkpointDistInBytes != 0 && this .bytesWritten > this .checkpointDistInBytes) { LOG.info( "Checkpointing current export." ); + + if ( this .checkpointSleepMs != 0) { + LOG.info( "Pausing." ); + Thread .sleep( this .checkpointSleepMs); + } + closeExportHandles(); initMySQLImportProcess(); this .bytesWritten = 0;
        Hide
        Jarek Jarcec Cecho added a comment - - edited

        Hi Zoltan,
        you've took interesting approach. Would you mind putting your patch to our review board (https://reviews.apache.org )?

        Jarcec

        Show
        Jarek Jarcec Cecho added a comment - - edited Hi Zoltan, you've took interesting approach. Would you mind putting your patch to our review board ( https://reviews.apache.org )? Jarcec
        Hide
        Zoltan Toth-Czifra added a comment -

        Sorry, I have been busy with this so I did not have time to test.
        Here is the review:

        https://reviews.apache.org/r/7135/

        Also, results when executing with different settings of sqoop.mysql.export.checkpoint.bytes and sqoop.mysql.export.sleep.ms:

        33554432B / 0ms: Transferred 4.7579 MB in 8.7175 seconds (558.8826 KB/sec)
        102400B / 500ms: Transferred 4.7579 MB in 35.7794 seconds (136.1698 KB/sec)
        51200B / 500ms: Transferred 4.758 MB in 57.8675 seconds (84.1959 KB/sec)
        51200B / 250ms: Transferred 4.7579 MB in 35.0293 seconds (139.0854 KB/sec)
        
        Show
        Zoltan Toth-Czifra added a comment - Sorry, I have been busy with this so I did not have time to test. Here is the review: https://reviews.apache.org/r/7135/ Also, results when executing with different settings of sqoop.mysql.export.checkpoint.bytes and sqoop.mysql.export.sleep.ms: 33554432B / 0ms: Transferred 4.7579 MB in 8.7175 seconds (558.8826 KB/sec) 102400B / 500ms: Transferred 4.7579 MB in 35.7794 seconds (136.1698 KB/sec) 51200B / 500ms: Transferred 4.758 MB in 57.8675 seconds (84.1959 KB/sec) 51200B / 250ms: Transferred 4.7579 MB in 35.0293 seconds (139.0854 KB/sec)
        Hide
        Abhijeet Gaikwad added a comment -

        Attaching on behalf of Zoltan. Latest patch taken from Review board.

        Show
        Abhijeet Gaikwad added a comment - Attaching on behalf of Zoltan. Latest patch taken from Review board.
        Hide
        Abhijeet Gaikwad added a comment -

        Thanks Zoltan for your contribution.

        Show
        Abhijeet Gaikwad added a comment - Thanks Zoltan for your contribution.
        Hide
        Hudson added a comment -

        Integrated in Sqoop-ant-jdk-1.6-hadoop200 #269 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop200/269/)
        SQOOP-604: Easy throttling feature for MySQL exports (Revision c499f49097ebf04f9fac34f1df768a319e679cea)

        Result = SUCCESS
        abhijeet : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=c499f49097ebf04f9fac34f1df768a319e679cea
        Files :

        • src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java
        Show
        Hudson added a comment - Integrated in Sqoop-ant-jdk-1.6-hadoop200 #269 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop200/269/ ) SQOOP-604 : Easy throttling feature for MySQL exports (Revision c499f49097ebf04f9fac34f1df768a319e679cea) Result = SUCCESS abhijeet : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=c499f49097ebf04f9fac34f1df768a319e679cea Files : src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java
        Hide
        Hudson added a comment -

        Integrated in Sqoop-ant-jdk-1.6-hadoop100 #259 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop100/259/)
        SQOOP-604: Easy throttling feature for MySQL exports (Revision c499f49097ebf04f9fac34f1df768a319e679cea)

        Result = SUCCESS
        abhijeet : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=c499f49097ebf04f9fac34f1df768a319e679cea
        Files :

        • src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java
        Show
        Hudson added a comment - Integrated in Sqoop-ant-jdk-1.6-hadoop100 #259 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop100/259/ ) SQOOP-604 : Easy throttling feature for MySQL exports (Revision c499f49097ebf04f9fac34f1df768a319e679cea) Result = SUCCESS abhijeet : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=c499f49097ebf04f9fac34f1df768a319e679cea Files : src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java
        Hide
        Hudson added a comment -

        Integrated in Sqoop-ant-jdk-1.6-hadoop23 #414 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop23/414/)
        SQOOP-604: Easy throttling feature for MySQL exports (Revision c499f49097ebf04f9fac34f1df768a319e679cea)

        Result = SUCCESS
        abhijeet : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=c499f49097ebf04f9fac34f1df768a319e679cea
        Files :

        • src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java
        Show
        Hudson added a comment - Integrated in Sqoop-ant-jdk-1.6-hadoop23 #414 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop23/414/ ) SQOOP-604 : Easy throttling feature for MySQL exports (Revision c499f49097ebf04f9fac34f1df768a319e679cea) Result = SUCCESS abhijeet : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=c499f49097ebf04f9fac34f1df768a319e679cea Files : src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java
        Hide
        Hudson added a comment -

        Integrated in Sqoop-ant-jdk-1.6-hadoop20 #262 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop20/262/)
        SQOOP-604: Easy throttling feature for MySQL exports (Revision c499f49097ebf04f9fac34f1df768a319e679cea)

        Result = SUCCESS
        abhijeet : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=c499f49097ebf04f9fac34f1df768a319e679cea
        Files :

        • src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java
        Show
        Hudson added a comment - Integrated in Sqoop-ant-jdk-1.6-hadoop20 #262 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop20/262/ ) SQOOP-604 : Easy throttling feature for MySQL exports (Revision c499f49097ebf04f9fac34f1df768a319e679cea) Result = SUCCESS abhijeet : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=c499f49097ebf04f9fac34f1df768a319e679cea Files : src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java
        Hide
        Zoltan Toth-Czifra added a comment -

        Thank you guys! I promise next time it will go smoother

        Show
        Zoltan Toth-Czifra added a comment - Thank you guys! I promise next time it will go smoother
        Hide
        Zoltan Toth-Czifra added a comment -

        Attaching the same patch on my own in order to explicitly give Apache Software Foundation (ASF) right to use that piece of code in Sqoop under ASF license.

        Show
        Zoltan Toth-Czifra added a comment - Attaching the same patch on my own in order to explicitly give Apache Software Foundation (ASF) right to use that piece of code in Sqoop under ASF license.

          People

          • Assignee:
            Zoltan Toth-Czifra
            Reporter:
            Zoltan Toth-Czifra
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development