Details

      Description

      We need to load some Kudu tables as part of data loading. Each table needs an explicit distribution scheme, so for the time being any Kudu table we load will probably need custom load logic. (A way to create Kudu tables with an implicit default distribution would be nice to have later on.)

      Because adding Kudu tables this way is expensive, we probably can't load every table that we load for the other table formats (for now). We should identify the subset of tables needed for the EE tests, stress tests, and qgen tests.

          Activity

          mjacobs Matthew Jacobs added a comment - cc Dimitris Tsirogiannis
          mjacobs Matthew Jacobs added a comment -

          I just realized this is going to be complicated by:

          1. Conflicts with current work for IMPALA-3719 (and children). I'll work with Dimitris Tsirogiannis to get IMPALA-3719 wrapped up.
          2. Lack of timestamp support (IMPALA-3557); timestamps are used extensively by the functional schema. We can probably do something hacky, as we do for Avro, which also doesn't support timestamps, but the code is a mess so this will be a bit more difficult than expected.
          mjacobs Matthew Jacobs added a comment -

          Commit https://github.com/apache/incubator-impala/commit/c7fa03286b473a34cdb170f8c89c261fb02d17a6 added support for the majority of the functional schema.

          However, there are some limitations from Kudu:
          a) Primary key columns must currently be the first columns in the table definition (KUDU-1271).
          b) Primary key columns cannot be nullable (KUDU-1570).

          As a result, alltypesagg has to be added as a view over an underlying table that has a unique, non-nullable PK. Some tables have also been left out for the same reasons, given that they're not used in general tests (they are used only in, e.g., HDFS-specific tests).
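
To make the two constraints concrete, here is a minimal sketch of a pre-load check that a table definition could be run through. It is not code from the commit: `Column` and `kudu_pk_violations` are hypothetical names, and the sketch assumes that "first columns" means the leading columns appear in the same order as the primary key.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Column:
    # Hypothetical, minimal description of one column in a table definition.
    name: str
    nullable: bool = True


def kudu_pk_violations(columns: List[Column], pk: List[str]) -> List[str]:
    """Return a list of messages for violations of the two constraints above."""
    problems = []
    names = [c.name for c in columns]
    # (a) PK columns must be the first columns of the definition (KUDU-1271).
    # Assumption: they must also appear in the same order as the key.
    if names[:len(pk)] != pk:
        problems.append("primary key columns are not the first columns")
    # (b) PK columns cannot be nullable (KUDU-1570).
    by_name = {c.name: c for c in columns}
    for key in pk:
        col = by_name.get(key)
        if col is None:
            problems.append(f"primary key column {key} is not defined")
        elif col.nullable:
            problems.append(f"primary key column {key} is nullable")
    return problems


# Made-up definitions for illustration; not the real functional schema.
bad = [Column("year", nullable=False), Column("id", nullable=True)]
good = [Column("id", nullable=False), Column("year")]
print(kudu_pk_violations(bad, ["id"]))
print(kudu_pk_violations(good, ["id"]))
```

A table that fails either check would need the kind of workaround described above: an underlying table with a usable key, plus a view that exposes the original column layout.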


            People

            • Assignee:
              mjacobs Matthew Jacobs
              Reporter:
              mjacobs Matthew Jacobs