Uploaded image for project: 'Sqoop (Retired)'
  1. Sqoop (Retired)
  2. SQOOP-3456

Sqoop import from Hive to MySQL fails for Emoitcons

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Blocker
    • Resolution: Unresolved
    • 1.4.6
    • None
    • connectors/mysql
    • None
    • Production: 

      Part of CloudEra

       

    • Patch, Important

    Description

      We are facing an issue with inserting emoticons into Mysql via sqoop import from Hive. Here is the full description of the problem. Please can we get some urgent help.

       ** 

      Error Message:

      Caused by: java.sql.SQLException: Incorrect string value: '\xF0\x9F\x9A\x97 \xF0...' for column 'Description' at row

       

      Script:

      nohup sqoop export -D mapreduce.job.queuename="*****" --connect "jdbc:mysql://xxx.mysql.database.azure.com/xxx" --username 'mysqladmin@xxx-mysql-db10' --password '*******' --table workorder_emojitest --hcatalog-database test_db -hcatalog-table workorder_test_1025 --input-fields-terminated-by , --driver com.mysql.jdbc.Driver >> /tmp/workorder_em.logs &

       

      Options tried so far (none of them have worked)

       

      Option 1 - not successful

      A SerDe allows Hive to read in data from a table, and write it back out to HDFS in any custom format.

      We have installed the serde jars and kepts in hive/lib location for emojis data support But it's not worked. and also we kept this serde jars in in Sqoop/lib but no luck.

       

      Option 2 - not successful

      In mySql server the emojis data types we are using var char with serde properties utf8mb4

      In Hive   server the emojis data types are strings with serde properties  utf-8

      Manually we are able to insert in my SQL server and hive server based on above properties.

      Note: In MySQL server serde properties which are using  utf8mb4 is 4bytes

                 *           In hive server serde properties which are using  utf8 is *3bytes

      we have tried all the way to change  utf8mb4 to utf8 in  MySQL server but it was not supporting.

      and we changed UTF-8 properties to UFT-16 in hive server and checked but no luck.

      also we have modified every properties in sqoop export command and tried.

      jdbc:mysql://localhost:3306/?useUnicode=yes&characterEncoding=UTF-8

      default-character-set=utf8  character-set=utf8mb4

      collation-server=utf8mb4_unicode_ci

       

      Option 3 - not successful

      We also we tried all the possible ways on MySQL side and changed MySQL properties and restarted the server.

      character_set_server=utf8mb4character_set_server=utf-8character_set_connection utf-8 character-set results utf-8

      Attachments

        Activity

          People

            Unassigned Unassigned
            SNPARVATIKAR Sudhinder
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated: