Introduce SORT BY clause in CREATE TABLE statement

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.2, Impala 2.3.0, Impala 2.5.0, Impala 2.4.0, Impala 2.6.0, Impala 2.7.0
    • Fix Version/s: Impala 2.9.0
    • Component/s: Catalog

    Description

      This issue is intended as a usability improvement for IMPALA-4163: the SORT BY columns can be specified directly in the table definition, like this:

      CREATE TABLE t (day INT, hour INT)
      PARTITIONED BY (year INT, month INT)
      SORT BY (day, hour);
      

      The above table creation has the effect that all inserts into the table have an implicit "sortby(day,hour)" plan hint applied. See IMPALA-4163 for details on the hint.
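      As a rough sketch of that equivalence, the two inserts below should produce the same sorted output. The `staging` source table is hypothetical, and the hint spelling follows the usual Impala comment-style hint placement; IMPALA-4163 is the authority on the exact hint syntax.

```sql
-- Assumption: `staging` is a hypothetical source table with matching columns.

-- Without SORT BY in the table definition, every insert needs the hint:
INSERT INTO t PARTITION (year, month) /* +sortby(day,hour) */
SELECT day, hour, year, month FROM staging;

-- With SORT BY (day, hour) in the table definition, the same insert
-- is written without a hint and is still sorted by day, hour:
INSERT INTO t PARTITION (year, month)
SELECT day, hour, year, month FROM staging;
```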

      Just like the "sortby" hint, the SORT BY clause can only contain non-partition columns for HDFS tables and non-primary-key columns for Kudu tables.
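      To illustrate the restriction for HDFS tables, here is a sketch with hypothetical table names. The second statement is expected to be rejected during analysis because it names a partition column; the exact error text is not specified in this issue.

```sql
-- Accepted: day and hour are regular (non-partition) columns.
CREATE TABLE t_ok (day INT, hour INT)
PARTITIONED BY (year INT, month INT)
SORT BY (day, hour);

-- Expected to fail: year is a partition column and may not appear in SORT BY.
CREATE TABLE t_bad (day INT, hour INT)
PARTITIONED BY (year INT, month INT)
SORT BY (year, day);
```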

      This has the following benefits:

      • Convenience: the clause has the same effect as the sortby() hint for INSERT statements, but users no longer have to remember to include the hint in every INSERT statement.
      • The SORT BY columns are a physical design choice, so it makes sense to store them as part of the table metadata.

      Challenges:

      • The Hive Metastore has no SORT BY concept, so we'll need to store the information in the generic TBLPROPERTIES map.
      • No other engines (Hive, Spark) will understand this table property. Data written by those engines will therefore not be sorted unless the writer requests sorting explicitly, to the extent that engine offers a way to do so.
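
      A minimal sketch of what storing the sort columns in TBLPROPERTIES could look like from the user's side. The property key `sort.columns` and its comma-separated value format are assumptions made for illustration; this issue does not name the key. The table name is hypothetical.

```sql
-- Assumption: 'sort.columns' is an illustrative key, not confirmed by this issue.
CREATE TABLE t_props (day INT, hour INT)
PARTITIONED BY (year INT, month INT)
TBLPROPERTIES ('sort.columns'='day,hour');

-- The table parameters section of the output lists the stored property.
DESCRIBE FORMATTED t_props;
```

      Because the value lives in a generic string map, engines that do not know the key simply ignore it, which is the interoperability gap described in the second bullet.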

          People

            lv Lars Volker
            alex.behm Alexander Behm