IMPALA-1784

Show create table on HBase tables is inconsistent with Hive


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Impala 2.1.1
    • Product Backlog
    • None

    Description

      Running SHOW CREATE TABLE on an HBase table in Impala lists the columns in a different order than Hive does. This is a problem if you take the statement generated by Impala and run it in Hive: column order matters, because the columns must line up positionally with the entries in the "hbase.columns.mapping" serde property.

      Example.

      Correct (Hive)

      CREATE EXTERNAL TABLE `jira_2`(
        `hbasekey` string COMMENT 'from deserializer',
        `project` string COMMENT 'from deserializer',
        `issueid` string COMMENT 'from deserializer',
        `title` string COMMENT 'from deserializer',
        `summary` string COMMENT 'from deserializer',
        `createdts` bigint COMMENT 'from deserializer',
        `updatedts` bigint COMMENT 'from deserializer',
        `issuetype` string COMMENT 'from deserializer',
        `priority` string COMMENT 'from deserializer',
        `resolution` string COMMENT 'from deserializer',
        `affectsversion` string COMMENT 'from deserializer',
        `fixversion` string COMMENT 'from deserializer',
        `component` string COMMENT 'from deserializer',
        `clouderaflags` string COMMENT 'from deserializer',
        `status` string COMMENT 'from deserializer',
        `assignee` string COMMENT 'from deserializer',
        `reporter` string COMMENT 'from deserializer',
        `labels` string COMMENT 'from deserializer')
      ROW FORMAT SERDE
        'org.apache.hadoop.hive.hbase.HBaseSerDe'
      STORED BY
        'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
      WITH SERDEPROPERTIES (
        'serialization.format'='1',
        'hbase.columns.mapping'=':key, rep:project, rep:issue_id, content:title, content:summary, meta:created#b, meta:updated#b, rep:type, rep:priority, rep:resolution, meta:version, meta:fix_version, meta:component, meta:cloudera_flags, rep:status, rep:assignee, rep:reporter, meta:labelsStr')
      LOCATION
        'hdfs://nameservice1/user/hive/warehouse/jira'
      TBLPROPERTIES (
        'hbase.table.name'='jira_ticket',
        'transient_lastDdlTime'='1405092795')
      

      Incorrect (Impala)

      CREATE EXTERNAL TABLE default.jira_2 (
        hbasekey STRING,
        summary STRING,
        title STRING,
        clouderaflags STRING,
        component STRING,
        createdts BIGINT,
        fixversion STRING,
        updatedts BIGINT,
        affectsversion STRING,
        assignee STRING,
        issueid STRING,
        priority STRING,
        project STRING,
        reporter STRING,
        resolution STRING,
        status STRING,
        issuetype STRING,
        labels STRING
      )
      STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
      WITH SERDEPROPERTIES ('hbase.columns.mapping'=':key, rep:project, rep:issue_id, content:title, content:summary, meta:created#b, meta:updated#b, rep:type, rep:priority, rep:resolution, meta:version, meta:fix_version, meta:component, meta:cloudera_flags, rep:status, rep:assignee, rep:reporter, meta:labelsStr', 'serialization.format'='1')
      TBLPROPERTIES ('hbase.table.name'='jira_ticket', 'transient_lastDdlTime'='1405092795', 'storage_handler'='org.apache.hadoop.hive.hbase.HBaseStorageHandler')
      

      The workaround is simply to avoid Impala's SHOW CREATE TABLE for HBase tables and use Hive's output instead, but we should probably get this fixed at some point.
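
To make the misalignment concrete, here is a minimal sketch in Python that pairs each column with the "hbase.columns.mapping" entry at the same position, once in Hive's order and once in Impala's. The column lists and the mapping string are copied from the two statements above; the helper names `pair_columns` and `misaligned` are hypothetical and not part of Hive or Impala. The sketch assumes the mapping is a plain comma-separated list matched to the columns by position.

```python
# Column order as printed by Hive and by Impala in the statements above.
HIVE_COLUMNS = [
    "hbasekey", "project", "issueid", "title", "summary", "createdts",
    "updatedts", "issuetype", "priority", "resolution", "affectsversion",
    "fixversion", "component", "clouderaflags", "status", "assignee",
    "reporter", "labels",
]
IMPALA_COLUMNS = [
    "hbasekey", "summary", "title", "clouderaflags", "component", "createdts",
    "fixversion", "updatedts", "affectsversion", "assignee", "issueid",
    "priority", "project", "reporter", "resolution", "status", "issuetype",
    "labels",
]
# Value of 'hbase.columns.mapping', identical in both statements.
MAPPING = (
    ":key, rep:project, rep:issue_id, content:title, content:summary, "
    "meta:created#b, meta:updated#b, rep:type, rep:priority, rep:resolution, "
    "meta:version, meta:fix_version, meta:component, meta:cloudera_flags, "
    "rep:status, rep:assignee, rep:reporter, meta:labelsStr"
)


def pair_columns(columns, mapping):
    """Pair each column with the mapping entry at the same position."""
    entries = [entry.strip() for entry in mapping.split(",")]
    if len(entries) != len(columns):
        raise ValueError("column count does not match mapping entry count")
    return list(zip(columns, entries))


def misaligned(reference, candidate, mapping):
    """List (column, intended entry, actual entry) for every column that the
    candidate order would bind to a different entry than the reference order."""
    intended = dict(pair_columns(reference, mapping))
    return [
        (column, intended[column], entry)
        for column, entry in pair_columns(candidate, mapping)
        if intended[column] != entry
    ]


if __name__ == "__main__":
    for column, want, got in misaligned(HIVE_COLUMNS, IMPALA_COLUMNS, MAPPING):
        print(f"{column}: intended {want}, would be bound to {got}")
```

With the data from this issue, only `hbasekey`, `createdts` and `labels` keep their position, so 15 of the 18 columns would be bound to the wrong HBase column if the Impala statement were run in Hive. For example, `summary` would end up mapped to `rep:project` instead of `content:summary`.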

          People

            Assignee: Martin Grund (mgrund_impala_bb91)
            Reporter: Ricky Saltzer (rickysaltzer)