Uploaded image for project: 'Sqoop'
  1. Sqoop
  2. SQOOP-539

getPrimaryKeyQuery is very slow in mysql 5.0.77

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4.1-incubating
    • Fix Version/s: 1.4.2
    • Component/s: connectors/mysql
    • Labels:
    • Environment:

      mysql 5.0.77-percona-highperfb0805

      Description

      The current query used to get the primary key for a table takes 20 seconds in my production environment and 11 minutes in my QA environment. I simplified it and it now runs very fast (0.08 seconds in my production environment). I'm submitting the patch I applied locally.

      1. patch
        1 kB
        Jhovanny
      2. patch
        0.8 kB
        Jhovanny
      3. patch
        0.8 kB
        Jhovanny

        Activity

        Hide
        jhovanny Jhovanny added a comment -

        Thanks to Abhijeet and Jarek for such a prompt response. Happy to help.

        Show
        jhovanny Jhovanny added a comment - Thanks to Abhijeet and Jarek for such a prompt response. Happy to help.
        Hide
        hudson Hudson added a comment -

        Integrated in Sqoop-ant-jdk-1.6-hadoop23 #137 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop23/137/)
        SQOOP-539. getPrimaryKeyQuery is very slow in mysql 5.0.77.

        (Jhovanny via Jarek Jarcec Cecho) (Revision 1370124)

        Result = SUCCESS
        jarcec :
        Files :

        • /sqoop/trunk/src/java/org/apache/sqoop/manager/MySQLManager.java
        Show
        hudson Hudson added a comment - Integrated in Sqoop-ant-jdk-1.6-hadoop23 #137 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop23/137/ ) SQOOP-539 . getPrimaryKeyQuery is very slow in mysql 5.0.77. (Jhovanny via Jarek Jarcec Cecho) (Revision 1370124) Result = SUCCESS jarcec : Files : /sqoop/trunk/src/java/org/apache/sqoop/manager/MySQLManager.java
        Hide
        hudson Hudson added a comment -

        Integrated in Sqoop-ant-jdk-1.6-hadoop100 #5 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop100/5/)
        SQOOP-539. getPrimaryKeyQuery is very slow in mysql 5.0.77.

        (Jhovanny via Jarek Jarcec Cecho) (Revision 1370124)

        Result = FAILURE
        jarcec :
        Files :

        • /sqoop/trunk/src/java/org/apache/sqoop/manager/MySQLManager.java
        Show
        hudson Hudson added a comment - Integrated in Sqoop-ant-jdk-1.6-hadoop100 #5 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop100/5/ ) SQOOP-539 . getPrimaryKeyQuery is very slow in mysql 5.0.77. (Jhovanny via Jarek Jarcec Cecho) (Revision 1370124) Result = FAILURE jarcec : Files : /sqoop/trunk/src/java/org/apache/sqoop/manager/MySQLManager.java
        Hide
        hudson Hudson added a comment -

        Integrated in Sqoop-ant-jdk-1.6-hadoop20 #6 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop20/6/)
        SQOOP-539. getPrimaryKeyQuery is very slow in mysql 5.0.77.

        (Jhovanny via Jarek Jarcec Cecho) (Revision 1370124)

        Result = FAILURE
        jarcec :
        Files :

        • /sqoop/trunk/src/java/org/apache/sqoop/manager/MySQLManager.java
        Show
        hudson Hudson added a comment - Integrated in Sqoop-ant-jdk-1.6-hadoop20 #6 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop20/6/ ) SQOOP-539 . getPrimaryKeyQuery is very slow in mysql 5.0.77. (Jhovanny via Jarek Jarcec Cecho) (Revision 1370124) Result = FAILURE jarcec : Files : /sqoop/trunk/src/java/org/apache/sqoop/manager/MySQLManager.java
        Hide
        jarcec Jarek Jarcec Cecho added a comment -

        Committed revision 1370124.

        Thanks Jhovanny for your contribution!

        Jarcec

        Show
        jarcec Jarek Jarcec Cecho added a comment - Committed revision 1370124. Thanks Jhovanny for your contribution! Jarcec
        Hide
        jarcec Jarek Jarcec Cecho added a comment -

        Thanks Abhijeet for testing and Jhovany for the patch. I do not have any issues with it, so I'm giving +1 and I'll commit it shortly.

        Show
        jarcec Jarek Jarcec Cecho added a comment - Thanks Abhijeet for testing and Jhovany for the patch. I do not have any issues with it, so I'm giving +1 and I'll commit it shortly.
        Hide
        abhijeet_gaikwad Abhijeet Gaikwad added a comment -

        Just for committers info:
        Also verified that the query works successfully on mysql command-line and via Sqoop.

        Thanks Jhovanny for the patch.

        Show
        abhijeet_gaikwad Abhijeet Gaikwad added a comment - Just for committers info: Also verified that the query works successfully on mysql command-line and via Sqoop. Thanks Jhovanny for the patch.
        Hide
        abhijeet_gaikwad Abhijeet Gaikwad added a comment -

        Patch looks good to me now.
        As this is just a query change at one place, I feel we can bypass review board.

        For me,
        ant test -Dhadoopversion=100 [success]
        ant checkstyle [no errors]

        In all -
        +1 from my side

        But we need to wait till one of the committers approves this.

        Show
        abhijeet_gaikwad Abhijeet Gaikwad added a comment - Patch looks good to me now. As this is just a query change at one place, I feel we can bypass review board. For me, ant test -Dhadoopversion=100 [success] ant checkstyle [no errors] In all - +1 from my side But we need to wait till one of the committers approves this.
        Hide
        jhovanny Jhovanny added a comment -

        Thanks Abhijeet. Per your suggestion, I moved the code down to MySqlManager and tested it both with --direct and without. It still works properly. I'm attaching the updated patch file.

        Show
        jhovanny Jhovanny added a comment - Thanks Abhijeet. Per your suggestion, I moved the code down to MySqlManager and tested it both with --direct and without. It still works properly. I'm attaching the updated patch file.
        Hide
        abhijeet_gaikwad Abhijeet Gaikwad added a comment -

        In Sqoop InformationSchemaManager class is a super of two managers namely - SQLServerManager and MySqlManager.
        MySql contains column COLUMN_KEY in INFORMATION_SCHEMA.COLUMNS (http://dev.mysql.com/doc/refman/5.1/en/columns-table.html), but it seems SqlServer does not support this column (http://msdn.microsoft.com/en-us/library/ms188348%28v=sql.105%29.aspx). So this change will fail the SqlServer flow.

        What I propose is to make this change specific to Mysql by overriding getPrimaryKeyQuery() in MySqlManger class. Let me know.

        Show
        abhijeet_gaikwad Abhijeet Gaikwad added a comment - In Sqoop InformationSchemaManager class is a super of two managers namely - SQLServerManager and MySqlManager. MySql contains column COLUMN_KEY in INFORMATION_SCHEMA.COLUMNS ( http://dev.mysql.com/doc/refman/5.1/en/columns-table.html ), but it seems SqlServer does not support this column ( http://msdn.microsoft.com/en-us/library/ms188348%28v=sql.105%29.aspx ). So this change will fail the SqlServer flow. What I propose is to make this change specific to Mysql by overriding getPrimaryKeyQuery() in MySqlManger class. Let me know.
        Hide
        jhovanny Jhovanny added a comment -

        Patch for query change

        Show
        jhovanny Jhovanny added a comment - Patch for query change

          People

          • Assignee:
            jhovanny Jhovanny
            Reporter:
            jhovanny Jhovanny
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development