  1. Hive
  2. HIVE-20313

consider making ROW__ID a 1st class object


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.11.0
    • Fix Version/s: None
    • Component/s: Transactions
    • Labels: None

    Description

      ROW__ID, a struct that uniquely identifies a row within a partition of a full CRUD transactional table, is currently modeled as a VirtualColumn. The acid metadata columns from which ROW__ID is built are actually stored in the data file.
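
      To make the current modeling concrete, here is a minimal Python sketch (not Hive code) of a ROW__ID struct being assembled from acid metadata columns stored alongside each row. The `RowId` and `AcidEvent` types, their field names, and the `virtual_row_id` helper are illustrative assumptions, not Hive's actual classes or file layout.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class RowId:
    """Hypothetical stand-in for the ROW__ID struct."""
    write_id: int
    bucket_id: int
    row_id: int


@dataclass(frozen=True)
class AcidEvent:
    """Hypothetical stored record: acid metadata columns plus the user row."""
    operation: int
    original_write_id: int
    bucket: int
    row_id: int
    current_write_id: int
    row: Dict[str, Any]


def virtual_row_id(event: AcidEvent) -> RowId:
    # ROW__ID is not a declared column of the table; it is assembled
    # from metadata columns that already live in the data file.
    return RowId(event.original_write_id, event.bucket, event.row_id)


event = AcidEvent(operation=0, original_write_id=7, bucket=1, row_id=42,
                  current_write_id=7, row={"name": "a"})
print(virtual_row_id(event))
```

      In this sketch the table schema only knows about `row`; everything else has to be special-cased by whoever reads the file, which is the pain point this issue describes.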

      Making this work requires seemingly endless special handling of acid metadata columns throughout the code.

      Perhaps a better approach is to add a struct column to an acid table at creation time and make it a 1st class citizen visible in the metastore. 'select count ....' would need special handling to remove it. There may also need to be a way to make these columns read-only.

      For data added via Load Data, Add Partition, etc. (i.e. original files in a CRUD table), the acid reader would have to fill in the values as it does today.
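
      As a rough illustration of "filling in the values" for original files, which carry no acid metadata, the Python sketch below attaches a synthetic row ID to each row as it is read. It assumes a write ID of 0 for such data, a bucket ID supplied by the caller, and a row ID equal to the row's ordinal position; these choices and the `with_synthetic_row_ids` name are assumptions for illustration, and the real acid reader may differ.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple

# (write_id, bucket_id, row_id); a hypothetical flat form of the row ID struct.
SyntheticRowId = Tuple[int, int, int]


def with_synthetic_row_ids(
    rows: Iterable[Dict[str, Any]],
    bucket_id: int,
    first_row_id: int = 0,
) -> Iterator[Tuple[SyntheticRowId, Dict[str, Any]]]:
    """Pair each row of an original file with a row ID computed at read time."""
    for offset, row in enumerate(rows):
        # Assumption: data that predates any transaction gets write ID 0,
        # and the row ID is simply the row's position in the file.
        yield (0, bucket_id, first_row_id + offset), row


original_rows = [{"name": "a"}, {"name": "b"}]
tagged = list(with_synthetic_row_ids(original_rows, bucket_id=3))
```

      The `first_row_id` parameter is there so that a caller reading several original files for the same bucket could keep the row IDs unique by continuing the count from one file to the next.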

      This would make schema evolution, PPD (predicate pushdown), and projection pruning work seamlessly.
      It should also make it easy to support formats other than ORC in full CRUD tables.

      This will likely be painful but should be investigated.


            People

              Assignee: Unassigned
              Reporter: Eugene Koifman (ekoifman)
              Votes: 0
              Watchers: 3
