Sqoop (Retired) / SQOOP-2639

Unable to export utf-8 data to MySQL using --direct mode

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.4.6
    • Fix Version/s: None
    • Component/s: connectors/mysql
    • Labels: None

    Description

      I am able to import utf-8 (non-latin1) data successfully into HDFS via:

      sqoop import --connect jdbc:mysql://host/db --username XX --password YY \
      --mysql-delimiters \
      --table MYSQL_SRC_TABLE --target-dir ${SQOOP_DIR_PREFIX}/mysql_table --direct

      However, using

      sqoop export --connect jdbc:mysql://host/db --username XX --password YY \
      --mysql-delimiters \
      --table MYSQL_DEST_TABLE --export-dir ${SQOOP_DIR_PREFIX}/mysql_table \
      --direct

      cuts off each field after the first non-latin1 character (e.g. a letter with an umlaut).
      I also tried other options, such as --default-character-set=utf8, without success.

      I was able to fix the problem with the following change:
      Change https://svn.apache.org/repos/asf/sqoop/trunk/src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java, line 322 from
      this.mysqlCharSet = MySQLUtils.MYSQL_DEFAULT_CHARSET;
      to
      this.mysqlCharSet = "utf-8";
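
The effect of that one-line change can be illustrated outside Sqoop. The sketch below is not Sqoop code; it assumes that MySQLUtils.MYSQL_DEFAULT_CHARSET resolves to ISO-8859-1 and that the mapper encodes each field with String.getBytes using the configured charset. The class name CharsetMismatchDemo and the sample value are hypothetical. It shows that the same field produces different bytes under the two charsets, and that the ISO-8859-1 bytes are not well-formed UTF-8, which is one plausible reason a UTF-8 destination would stop reading the field at that character.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Returns true if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical field value containing a letter with an umlaut.
        String field = "M\u00fcnchen";

        // Assumed behaviour before the change: encode with ISO-8859-1.
        byte[] latin1 = field.getBytes(StandardCharsets.ISO_8859_1);
        // Behaviour after the change: encode with UTF-8.
        byte[] utf8 = field.getBytes(StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1: " + latin1.length
            + " bytes, well-formed UTF-8: " + isValidUtf8(latin1));
        System.out.println("UTF-8:      " + utf8.length
            + " bytes, well-formed UTF-8: " + isValidUtf8(utf8));
    }
}
```

Hardcoding the charset, as in the change above, fixes UTF-8 tables but would presumably break exports to tables that really are latin1; a configurable charset may be a safer design.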

      Hope this helps

      Attachments

        1. sqoop-2639.patch
          8 kB
          DaeMyung Kang

          People

            Assignee: DaeMyung Kang (charsyam)
            Reporter: Ranjan Bagchi (rbagchi)