Details

    • Type: Wish
    • Status: Open
    • Priority: Trivial
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels:
      None

      Description

      It would be nice in several cases to be able to alias column names.

      Say someone in your company CREATEd a TABLE called important_but_named_poorly (alvin BIGINT, theodore BIGINT, simon STRING) PARTITIONED BY (dave STRING), that indexes the relationship between an actor (alvin), a target (theodore), and the interaction between them (simon), partitioned based on the date string (dave). Renaming the columns would break a million pipelines that are important but ownerless.

      It would be awesome to define an aliasing system like this:

      ALTER TABLE important_but_named_poorly REPLACE COLUMNS (actor BIGINT AKA alvin, target BIGINT AKA theodore, ixn STRING AKA simon) PARTITIONED BY (ds STRING AKA dave);

      ...which would mean that any user could, e.g., use the term "dave" to refer to ds if they really wanted to.
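To make the intent concrete, here is a sketch of how queries might behave if the proposal were adopted. The AKA clause is hypothetical (it is only the syntax suggested in this issue, not something Hive supports), and the date literal is an illustrative value. The idea is that both queries below would resolve to the same columns and return the same rows.

```sql
-- HYPOTHETICAL: assumes the proposed ALTER TABLE ... AKA statement above
-- has been applied. AKA is not existing Hive syntax.

-- New code uses the clean names:
SELECT actor, target, ixn
FROM important_but_named_poorly
WHERE ds = '2011-01-01';  -- illustrative partition value

-- Legacy pipelines keep running unchanged, because each old name
-- would resolve to the same underlying column:
SELECT alvin, theodore, simon
FROM important_but_named_poorly
WHERE dave = '2011-01-01';
```

The point of the design is that the table keeps a single identity: no second object is created, so partition listings and direct reads would still apply to it.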

        Activity

        Adam Kramer added a comment -

        The use case here is basically backwards compatibility. Long-time users and new users of a table are all using the same table and want to refer to it as such; it is the canonical table.

        But sometimes the table was originally named with crummy names, and it'd be better and cleaner to document and train new people on the appropriate names.

        Views eat up the namespace and add a level of indirection that is not always desirable, but here are their two biggest limitations:

        • SELECT * is not fast. I can't SELECT * on a view and get data back immediately, the way I would by running the same query against the underlying table. This is true even when the schemas are exactly the same.
        • Partitions are not see-through. I can't use "show partitions" on a view or write any automated system based on the view to identify when new partitions land, which forces reference to the original table, and then all is lost.
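For reference, the view-based workaround under discussion might look like the sketch below. The view name important_and_named_well and the date literal are made up for illustration; the comments restate the two limitations listed above rather than measured behavior.

```sql
-- Hypothetical view that exposes the poorly named columns under clean names.
CREATE VIEW important_and_named_well AS
SELECT alvin    AS actor,
       theodore AS target,
       simon    AS ixn,
       dave     AS ds
FROM important_but_named_poorly;

-- Limitation 1: this is evaluated through the view definition, so it does
-- not come back as quickly as SELECT * on the table itself.
SELECT * FROM important_and_named_well WHERE ds = '2011-01-01';

-- Limitation 2: partition discovery has to name the original table,
-- which reintroduces the old names into any automated system.
SHOW PARTITIONS important_but_named_poorly;
```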
        John Sichi added a comment -

        Since views are already a standard way of addressing this, wouldn't it be better to put effort into fixing any limitations there?


          People

          • Assignee:
            Unassigned
            Reporter:
            Adam Kramer
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
