Hive / HIVE-7046

Propagate addition of new columns to partition schema


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.11.0, 0.12.0, 0.13.0
    • Fix Version/s: None
    • Component/s: Database/Schema
    • Labels: None

    Description

      Hive reads data according to the partition schema, not the table schema (a consequence of HIVE-3833). ALTER TABLE updates only the table schema; the changes are not propagated to existing partitions, so a partition's schema differs from the table's once the table schema has been altered. This is intentional: it preserves the ability to read existing data, particularly in binary formats such as RCFile. Binary formats do not allow the type of a field to change because of the way serialization works; for example, a field serialized as a string will be displayed incorrectly if read as an integer.
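
The divergence described above can be reproduced with a short HiveQL session. This is only a sketch: the table `page_views`, its columns, and the partition value are hypothetical names chosen for illustration, and the exact DESCRIBE output may vary between Hive versions.

```sql
-- Hypothetical partitioned table stored in a binary format (RCFile).
CREATE TABLE page_views (user_id STRING, url STRING)
PARTITIONED BY (ds STRING)
STORED AS RCFILE;

-- Create a partition; its schema is copied from the table at this point.
ALTER TABLE page_views ADD PARTITION (ds='2014-01-01');

-- Alter the table schema. Only the table-level metadata is updated.
ALTER TABLE page_views ADD COLUMNS (referrer STRING);

-- Table schema: expected to list user_id, url, referrer (plus ds).
DESCRIBE page_views;

-- Partition schema: expected to still list only user_id and url (plus ds),
-- because the new column was not propagated to the existing partition.
DESCRIBE page_views PARTITION (ds='2014-01-01');
```

Comparing the two DESCRIBE results is a simple way to confirm whether a given partition still carries the old schema.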

      Unfortunately, as a side effect, this behavior limits the ability to add new columns to already existing partitions using ALTER TABLE ADD COLUMNS. A possible workaround is to recreate the partitions manually, but that becomes unnecessarily cumbersome when there are many partitions. New columns should instead be propagated to existing partitions automatically.
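
For illustration, the manual workaround might look like the following for a single partition. This sketch assumes a hypothetical EXTERNAL table named `page_views_ext`, partitioned by `ds`, whose data lives at a hypothetical HDFS path; for an external table, dropping a partition removes only its metadata. On a managed table, DROP PARTITION also deletes the data, so these statements should not be applied there as written.

```sql
-- Assumption: page_views_ext is an EXTERNAL table, so the files under the
-- partition directory survive the DROP.
ALTER TABLE page_views_ext DROP PARTITION (ds='2014-01-01');

-- Re-adding the partition creates fresh partition metadata, which picks up
-- the table's current schema, including any newly added columns.
-- The LOCATION below is a hypothetical path used only for this example.
ALTER TABLE page_views_ext ADD PARTITION (ds='2014-01-01')
LOCATION '/data/page_views_ext/ds=2014-01-01';
```

Both statements have to be repeated for every existing partition, which is why the workaround scales poorly with the partition count.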


          People

            Assignee: Unassigned
            Reporter: Mariano Dominguez (mdominguez@cloudera.com)