Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Kudu_Impala
    • Fix Version/s: Impala 2.8.0
    • Component/s: Frontend
    • Labels:

      Description

      The current syntax is very verbose and relies on non-standard table properties that the user has to look up in the documentation. Instead, Impala should use standard SQL for table creation.

      Current syntax

      CREATE TABLE <Hive metastore name>
      [...]
      TBLPROPERTIES(
      'storage_handler' = 'com.cloudera.kudu.hive.KuduStorageHandler',
      'kudu.table_name' = <Kudu name>,
      'kudu.master_addresses' = <addrs>,
      'kudu.key_columns' = <col, …>,
      ['kudu.num_tablet_replicas' = <num>])
      

      Proposed syntax

      1) Add PRIMARY KEY and STORED AS KUDU

      CREATE TABLE ...
                  [(<col>, …, PRIMARY KEY (<col>, …))] | [(<col> PRIMARY KEY, …)]
      STORED AS KUDU
      

      2) Add an Impala (catalogd?) startup flag to list the default Kudu master addresses

      -default_kudu_master_addresses

      What actually happens

      CREATE TABLE <Hive metastore name>
      [...]
      TBLPROPERTIES (
      'storage_handler' = 'com.cloudera.kudu.hive.KuduStorageHandler',
      'kudu.table_name' = [<Hive metastore db name>.]<Hive metastore name>,
      'kudu.master_addresses' = <FLAG_default_kudu_master_addresses>,
      'kudu.key_columns' = <PRIMARY KEY cols>)
      

      The default TBLPROPERTIES shown above are set automatically, and the user can still override any of them.
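
      The defaulting and override behavior described above can be sketched in Python. This is only an illustration, not Impala's actual frontend code: the function name, its parameters, and the ", " separator used for key columns are assumptions made for the example.

```python
def effective_tblproperties(table_name, primary_key_cols, default_master_addresses,
                            db_name=None, overrides=None):
    """Hypothetical helper: derive the properties implied by STORED AS KUDU."""
    if not primary_key_cols:
        raise ValueError("a Kudu table needs at least one PRIMARY KEY column")

    # Default Kudu table name: [<db name>.]<Hive metastore name>
    kudu_name = f"{db_name}.{table_name}" if db_name else table_name

    props = {
        "storage_handler": "com.cloudera.kudu.hive.KuduStorageHandler",
        "kudu.table_name": kudu_name,
        # Stands in for the value of the proposed startup flag.
        "kudu.master_addresses": default_master_addresses,
        # Separator is an assumption; key order follows the PRIMARY KEY clause.
        "kudu.key_columns": ", ".join(primary_key_cols),
    }

    # User-supplied TBLPROPERTIES win over the defaults.
    props.update(overrides or {})
    return props


# Mirrors the "Override an option" case: only the Kudu table name changes.
props = effective_tblproperties("foo", ["key"], "127.0.0.1",
                                overrides={"kudu.table_name": "bar"})
```

      The point of the sketch is the ordering: defaults are computed first from the PRIMARY KEY clause and the startup flag, and explicit TBLPROPERTIES are applied last.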

      Examples

      Old way

      CREATE TABLE foo (key INT, value STRING)
      TBLPROPERTIES(
      'storage_handler' = 'com.cloudera.kudu.hive.KuduStorageHandler',
      'kudu.table_name' = 'foo',
      'kudu.master_addresses' = '127.0.0.1',
      'kudu.key_columns' = 'key')
      

      New way

      CREATE TABLE foo (key INT PRIMARY KEY, value STRING)
      STORED AS KUDU
      

      Override an option

      CREATE TABLE foo (key INT PRIMARY KEY, value STRING)
      STORED AS KUDU
      TBLPROPERTIES('kudu.table_name' = 'bar')
      

      Composite Key

      CREATE TABLE foo (key_1 INT, key_2 INT, value STRING, PRIMARY KEY(key_1, key_2))
      STORED AS KUDU
      

        Activity

        wdberkeley_impala_f7d4 Will Berkeley added a comment -

        +1 to the idea of this JIRA, but I prefer the following syntax (borrowed from MySQL):

        CREATE TABLE myKuduTable (
            key1 INT,
            key2 STRING,
            PRIMARY KEY (key1, key2)
        )
        STORED AS KUDU
        

        The reasoning is that neither key1 nor key2 is a key by itself. The key is the ordered pair (key1, key2), and I think this syntax makes that much clearer. Also, I think column order is not supposed to be significant in CREATE TABLE name (column defs), but the proposed syntax makes it significant when there are multiple key columns.

        caseyc casey added a comment -

        Hi Will, thanks for looking at this. I'm planning on supporting both

        CREATE TABLE t (k INT PRIMARY KEY) STORED AS KUDU
        

        and

        CREATE TABLE t (k1 INT, k2 STRING, PRIMARY KEY(k1, k2)) STORED AS KUDU
        

        I think your example is pretty much the same as the very bottom example in the main section, labeled "Composite Key".

        wdberkeley_impala_f7d4 Will Berkeley added a comment -

        Ah jeez, I missed that example. Sorry!

        dtsirogiannis Dimitris Tsirogiannis added a comment -

        Change-Id: I7b9d51b2720ab57649abdb7d5c710ea04ff50dc1
        Reviewed-on: http://gerrit.cloudera.org:8080/4414
        Reviewed-by: Alex Behm <alex.behm@cloudera.com>
        Tested-by: Internal Jenkins


          People

          • Assignee:
            dtsirogiannis Dimitris Tsirogiannis
            Reporter:
            caseyc casey
          • Votes:
            0
            Watchers:
            6
