Hive / HIVE-5317

Implement insert, update, and delete in Hive with full ACID support

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: Transactions
    • Labels: None

    Description

      Many customers want to be able to insert, update, and delete rows in Hive tables with full ACID support. The use cases vary, but the following query forms should be supported:

      • INSERT INTO tbl SELECT …
      • INSERT INTO tbl VALUES …
      • UPDATE tbl SET … WHERE …
      • DELETE FROM tbl WHERE …
      • MERGE INTO tbl USING src ON … WHEN MATCHED THEN … WHEN NOT MATCHED THEN …
      • SET TRANSACTION LEVEL …
      • BEGIN/END TRANSACTION
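
To make the requested forms concrete, here is a minimal HiveQL sketch. The table names `customer` and `customer_staging` and their columns are hypothetical. The DDL (ORC storage, bucketing, the `transactional` table property) follows how ACID tables were declared in the Hive releases that shipped this feature; this description does not specify any of it. The transaction-control statements are left out because the description does not spell out their syntax.

```sql
-- Hypothetical dimension table; names and columns are illustrative only.
CREATE TABLE customer (
  id    INT,
  name  STRING,
  state STRING
)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true');

-- INSERT INTO tbl VALUES ...
INSERT INTO TABLE customer VALUES (1, 'Alice', 'CA'), (2, 'Bob', 'NY');

-- INSERT INTO tbl SELECT ... (customer_staging is assumed to exist)
INSERT INTO TABLE customer
SELECT id, name, state FROM customer_staging;

-- UPDATE tbl SET ... WHERE ...
UPDATE customer SET state = 'WA' WHERE id = 1;

-- DELETE FROM tbl WHERE ...
DELETE FROM customer WHERE id = 2;
```

The update targets `state` rather than `id` because Hive does not allow updates to a bucketing column.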

      Use Cases

      • Once an hour, a set of inserts and updates (up to 500k rows) for various dimension tables (e.g., customer, inventory, stores) needs to be processed. The dimension tables have primary keys and are typically bucketed and sorted on those keys.
      • Once a day, a small set of records (up to 100k rows) needs to be deleted for regulatory compliance.
      • Once an hour, a log of transactions is exported from an RDBMS and the fact tables need to be updated (up to 1 million rows) to reflect the new data. The transactions are a combination of inserts, updates, and deletes. The table is partitioned and bucketed.
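
The third use case, applying a mixed change log to a fact table, could be expressed as a single MERGE. This is only a sketch: the tables `sales` and `sales_changes`, their columns, and the `op` flag (`'I'`, `'U'`, `'D'`) are assumptions made for illustration. The syntax follows the MERGE statement as it appeared in later Hive releases. It also assumes the change log holds at most one row per key.

```sql
-- Hypothetical fact table `sales` (partitioned by sale_date, bucketed on txn_id)
-- and hourly change log `sales_changes`; both are illustrative assumptions.
MERGE INTO sales AS t
USING sales_changes AS s
ON t.txn_id = s.txn_id
-- A delete record for an existing row removes it.
WHEN MATCHED AND s.op = 'D' THEN DELETE
-- Any other record for an existing row overwrites the measure.
WHEN MATCHED THEN UPDATE SET amount = s.amount
-- A non-delete record for an unknown key becomes a new row
-- (the partition column comes last in the value list).
WHEN NOT MATCHED AND s.op <> 'D' THEN INSERT VALUES (s.txn_id, s.amount, s.sale_date);
```

The update changes only `amount`, since partition and bucketing columns cannot be updated.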

      Attachments

        1. InsertUpdatesinHive.pdf (186 kB, Owen O'Malley)

        Issue Links

        1. Define API for RecordUpdater and UpdateReader (Sub-task, Resolved, Owen O'Malley)
        2. Transaction manager for Hive (Sub-task, Resolved, Alan Gates)
        3. Insert, update, delete functionality needs a compactor (Sub-task, Resolved, Alan Gates)
        4. Need new "show" functionality for transactions (Sub-task, Resolved, Alan Gates)
        5. Need to write documentation for ACID work (Sub-task, Resolved, Alan Gates)
        6. Streaming support in Hive (Sub-task, Resolved, Roshan Naik)
        7. Fix vectorized input to work with ACID (Sub-task, Resolved, Owen O'Malley)
        8. Disable CombineInputFormat for InputFormats that don't use FileSplit (Sub-task, Resolved, Owen O'Malley)
        9. Fix OrcRecordUpdater to use sync instead of flush (Sub-task, Resolved, Owen O'Malley)
        10. Fix reading partial ORC files while they are being written (Sub-task, Resolved, Owen O'Malley)
        11. Need file sink operators that work with ACID (Sub-task, Closed, Alan Gates)
        12. RecordUpdater should extend RecordWriter (Sub-task, Resolved, Alan Gates)
        13. Add ROW__ID VirtualColumn (Sub-task, Closed, Eugene Koifman)
        14. RecordUpdater should read virtual columns from row (Sub-task, Closed, Alan Gates)
        15. Modify parser to support new grammar for Insert, Update, Delete (Sub-task, Closed, Eugene Koifman)
        16. OrcRecordUpdater needs to implement getStats (Sub-task, Closed, Alan Gates)
        17. Generate plans for insert, update, and delete (Sub-task, Closed, Alan Gates)
        18. Update privileges to check for update and delete (Sub-task, Closed, Alan Gates)
        19. Update language manual for insert, update, and delete (Sub-task, Resolved, Alan Gates)
        20. Compactions need to update table/partition stats (Sub-task, Resolved, Eugene Koifman)

          People

            Assignee: Owen O'Malley (omalley)
            Reporter: Owen O'Malley (omalley)
            Votes: 34
            Watchers: 171
