  1. Sqoop
  2. SQOOP-489

Cannot define partition keys for Hive tables created through Sqoop

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.1-incubating
    • Fix Version/s: 1.4.2
    • Component/s: None
    • Labels:
      None

      Description

      When the --table option is set, Sqoop includes every column of the table in the CREATE TABLE query, and when the --hive-partition-key option is set, Sqoop blindly appends a PARTITIONED BY clause. If the partition key names one of the table's columns, the generated query causes a syntax error in Hive.

      For example, if we have a table 'FOO' that has columns 'I' and 'J':

      sqoop create-hive-table --table FOO ...

      will generate the following Hive query:

      CREATE TABLE IF NOT EXISTS `FOO` ( `I` STRING, `J` STRING)

      Now if we add "--hive-partition-key I" to the command, Sqoop generates the following query:

      CREATE TABLE IF NOT EXISTS `FOO` ( `I` STRING, `J` STRING) PARTITIONED BY (I STRING)

      The problem is that 'I' is defined twice (once in the CREATE TABLE column list and once in PARTITIONED BY), which is a syntax error in Hive.

      The correct query would be something like:

      CREATE TABLE IF NOT EXISTS `FOO` (`J` STRING) PARTITIONED BY (I STRING)
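
      The behavior suggested above can be sketched as follows. This is a minimal illustration in Python, not Sqoop's code and not necessarily how the attached patch resolves the issue. The function name build_create_table and its parameters are hypothetical, and the partition column is assumed to be typed STRING, as in the queries above.

```python
def build_create_table(table, columns, partition_key=None):
    """Build a Hive CREATE TABLE statement (hypothetical helper).

    columns is a list of (name, hive_type) pairs in table order.
    If partition_key names one of the columns, that column is left out
    of the column list so it is declared only in PARTITIONED BY.
    """
    # Case-sensitive match; a real tool might need to normalize case.
    body = [(name, hive_type) for name, hive_type in columns
            if name != partition_key]
    if not body:
        # Assumption: refuse to emit a table with an empty column list.
        raise ValueError("no columns left after removing the partition key")

    col_defs = ", ".join(f"`{name}` {hive_type}" for name, hive_type in body)
    ddl = f"CREATE TABLE IF NOT EXISTS `{table}` ({col_defs})"
    if partition_key is not None:
        # Partition column typed as STRING, mirroring the example queries.
        ddl += f" PARTITIONED BY ({partition_key} STRING)"
    return ddl


foo_columns = [("I", "STRING"), ("J", "STRING")]
print(build_create_table("FOO", foo_columns, partition_key="I"))
```

      With the 'FOO' table from the example, the partition key 'I' is dropped from the column list and appears only in the PARTITIONED BY clause.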

        Attachments

        1. SQOOP-489.patch
          4 kB
          Cheolsoo Park
        2. SQOOP-489.patch
          4 kB
          Cheolsoo Park
        3. SQOOP-489.patch
          0.9 kB
          Cheolsoo Park

            People

            • Assignee:
              cheolsoo Cheolsoo Park
              Reporter:
              kathleen Kathleen Ting
            • Votes:
              0
              Watchers:
              5
