  1. Sqoop (Retired)
  2. SQOOP-489

Cannot define partition keys for Hive tables created through Sqoop

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.1-incubating
    • Fix Version/s: 1.4.2
    • Component/s: None
    • Labels: None

    Description

      When the --table option is given, Sqoop includes every column of the table in the CREATE TABLE query, and when the --hive-partition-key option is given, Sqoop blindly appends a PARTITIONED BY clause. If the column named in --hive-partition-key is also one of the table's columns, the generated query is a syntax error in Hive.

      For example, if we have a table 'FOO' that has columns 'I' and 'J':

      sqoop create-hive-table --table FOO ...

      will generate the following Hive query:

      CREATE TABLE IF NOT EXISTS `FOO` ( `I` STRING, `J` STRING)

      Now if we add "--hive-partition-key I" to the command, Sqoop generates the following query:

      CREATE TABLE IF NOT EXISTS `FOO` ( `I` STRING, `J` STRING) PARTITIONED BY (I STRING)

      The problem is that 'I' is defined twice (once in CREATE TABLE and once in PARTITIONED BY), which is a syntax error in Hive.

      The correct query would be something like:

      CREATE TABLE IF NOT EXISTS `FOO` (`J` STRING) PARTITIONED BY (I STRING)
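
      The intended behaviour can be sketched as follows. This is only an illustration of the idea, not the attached patch and not Sqoop's actual code: the function name build_create_table and its parameters are hypothetical, and the sketch assumes the partition column is always typed STRING, as in the queries above. It skips the partition key when listing the regular columns, so the key is declared only once.

```python
def build_create_table(table, columns, partition_key=None):
    """Build a Hive CREATE TABLE statement (hypothetical helper, not Sqoop code).

    columns: list of (name, hive_type) pairs in table order.
    partition_key: optional name of the column to partition by.
    """
    # Leave the partition key out of the regular column list: Hive rejects
    # a column that is declared both there and in PARTITIONED BY.
    body = ", ".join(
        f"`{name}` {hive_type}"
        for name, hive_type in columns
        if name != partition_key
    )
    ddl = f"CREATE TABLE IF NOT EXISTS `{table}` ({body})"
    if partition_key is not None:
        # Assumption: the partition column is typed STRING, as in the report.
        ddl += f" PARTITIONED BY ({partition_key} STRING)"
    return ddl


if __name__ == "__main__":
    cols = [("I", "STRING"), ("J", "STRING")]
    print(build_create_table("FOO", cols))
    print(build_create_table("FOO", cols, partition_key="I"))
```

      The comparison here is an exact, case-sensitive string match; whether column names should be matched case-insensitively is left open in this sketch.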

      Attachments

        1. SQOOP-489.patch
          4 kB
          Cheolsoo Park
        2. SQOOP-489.patch
          4 kB
          Cheolsoo Park
        3. SQOOP-489.patch
          0.9 kB
          Cheolsoo Park

          People

            Cheolsoo Park
            Kathleen Ting
            Votes: 0
            Watchers: 5
