HIVE-20435

Failed Dynamic Partition Insert into insert-only table may lose transaction metadata

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Won't Fix
    • Affects Version/s: 3.0.0
    • Fix Version/s: 4.0.0-alpha-2
    • Component/s: Transactions
    • None
    • n/a

    Description

      TxnHandler.enqueueLockWithRetry() has an optimization where it doesn't write to TXN_COMPONENTS if the write is a dynamic partition insert, because it expects this table to be written later from addDynamicPartitions().

      For insert-only transactional tables, we create the target dir and start writing to it before addDynamicPartitions() is called. So if a txn is aborted before that call, we may have a delta dir in the partition but no corresponding entry in TXN_COMPONENTS. This means TxnStore.cleanEmptyAbortedTxns() may clean up the TXNS entry for the aborted transaction before the Compactor removes this delta dir, at which point the delta looks like committed data.
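      The sequence above can be illustrated with a small in-memory model. This is only a sketch and not Hive code: the class, its methods, and the sets standing in for TXNS, TXN_COMPONENTS and the delta directories are all hypothetical. It also assumes that a reader treats a delta whose transaction id is no longer present in TXNS as committed, which is how the description characterizes the failure.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy in-memory model, not Hive code: every name below is hypothetical.
final class AbortedDeltaSketch {
    enum TxnState { OPEN, ABORTED }

    // Stand-ins for the TXNS and TXN_COMPONENTS metastore tables.
    private final Map<Long, TxnState> txns = new HashMap<>();
    private final Set<Long> txnComponents = new HashSet<>();
    // Stand-in for delta directories on the file system, keyed by writer txn id.
    private final Set<Long> deltaDirs = new HashSet<>();

    void openTxn(long txnId) { txns.put(txnId, TxnState.OPEN); }

    // Models the lock-time optimization: dynamic partition writes skip TXN_COMPONENTS.
    void enqueueLock(long txnId, boolean dynamicPartitionInsert) {
        if (!dynamicPartitionInsert) {
            txnComponents.add(txnId);
        }
    }

    void writeDelta(long txnId) { deltaDirs.add(txnId); }

    void abortTxn(long txnId) { txns.put(txnId, TxnState.ABORTED); }

    // Models cleanup of aborted txns that have no TXN_COMPONENTS entries.
    void cleanEmptyAbortedTxns() {
        txns.entrySet().removeIf(e ->
            e.getValue() == TxnState.ABORTED && !txnComponents.contains(e.getKey()));
    }

    // Assumption of this model: a delta whose txn id is absent from TXNS reads as committed.
    boolean looksCommitted(long txnId) {
        return deltaDirs.contains(txnId) && !txns.containsKey(txnId);
    }

    static boolean abortedDeltaLooksCommitted(boolean dynamicPartitionInsert) {
        AbortedDeltaSketch s = new AbortedDeltaSketch();
        long txnId = 1L;
        s.openTxn(txnId);
        s.enqueueLock(txnId, dynamicPartitionInsert);
        s.writeDelta(txnId);        // insert-only: data lands in the partition directly
        s.abortTxn(txnId);          // abort before addDynamicPartitions() would have run
        s.cleanEmptyAbortedTxns();  // runs before any compactor removes the delta
        return s.looksCommitted(txnId);
    }

    public static void main(String[] args) {
        System.out.println("dynamic partition insert: " + abortedDeltaLooksCommitted(true));
        System.out.println("static partition insert:  " + abortedDeltaLooksCommitted(false));
    }
}
```

      In this model the dynamic partition case reports the aborted delta as committed, while the static case keeps the aborted TXNS row because a TXN_COMPONENTS entry still refers to it.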

      Streaming API V2 with dynamic partition mode also has this problem.

      Full CRUD tables are currently immune to this since they rely on the "move" operation in MoveTask, but in the longer term they should follow the same model as insert-only tables.

            People

              Assignee: ekoifman Eugene Koifman
              Reporter: ekoifman Eugene Koifman