Hive / HIVE-18923

ValidWriteIdList snapshot per table can be cached for multi-statement transactions.


Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: Transactions

    Description

      Currently, each query in a multi-statement transaction asks the metastore (TxnHandler) to build the ValidWriteIdList snapshot for the given table. This is costly because it requires a round trip to the metastore RDBMS. However, the snapshot does not change for the duration of the transaction, so it makes sense to cache it in QueryTxnManager for better performance.
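
      The caching idea above can be sketched roughly as follows. This is a minimal illustration, not Hive's actual code: the class WriteIdSnapshotCache, its methods, and the use of a plain String to stand in for a snapshot are all hypothetical, and the metastore call is modeled as an arbitrary lookup function.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical per-transaction cache; names are illustrative, not Hive APIs.
class WriteIdSnapshotCache {
    // Key: fully qualified table name. Value: the snapshot fetched for it
    // (a String stands in for a real ValidWriteIdList here).
    private final Map<String, String> snapshots = new HashMap<>();

    // Assumed stand-in for the expensive metastore/TxnHandler request.
    private final Function<String, String> metastoreLookup;

    WriteIdSnapshotCache(Function<String, String> metastoreLookup) {
        this.metastoreLookup = metastoreLookup;
    }

    // Returns the cached snapshot, calling the metastore only on first use.
    String get(String fullTableName) {
        return snapshots.computeIfAbsent(fullTableName, metastoreLookup);
    }

    // Intended to be called when the transaction commits or aborts.
    void clear() {
        snapshots.clear();
    }
}
```

      Under these assumptions, the second and later statements of a transaction that read the same table would be served from the map, and the cache would be discarded when the transaction ends.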

      However, each transaction must be able to see the rows it has written itself. So, when a transaction allocates a writeId to write to a table, the cached ValidWriteIdList for that table should be recalculated as follows.

      Original ValidWriteIdList: {hwm=10, open/aborted=5,6} – (10 is allocated by txn < current txn_id).

      Allocated writeId for this txn: 13 – (11 and 12 are taken by some other txn > current txn_id)

      New ValidWriteIdList: {hwm=12, open/aborted=5,6,11,12} – (11 and 12 are added to the invalid list, so the snapshot remains the same).
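
      The recalculation in the example above can be sketched as follows. This is a hedged illustration only: CachedWriteIdSnapshot is a hypothetical stand-in, not Hive's ValidWriteIdList implementation. The description does not say how the transaction's own writeId (13) becomes visible when hwm is 12, so the sketch assumes it is tracked in a separate field.

```java
import java.util.TreeSet;

// Hypothetical stand-in for a cached per-table snapshot; not a Hive class.
final class CachedWriteIdSnapshot {
    final long hwm;               // high-water mark of the snapshot
    final TreeSet<Long> invalid;  // open or aborted write IDs at or below hwm
    final long ownWriteId;        // assumption: own write ID kept separately, -1 if none

    CachedWriteIdSnapshot(long hwm, long ownWriteId, long... invalidIds) {
        this(hwm, ownWriteId, new TreeSet<Long>());
        for (long id : invalidIds) {
            invalid.add(id);
        }
    }

    private CachedWriteIdSnapshot(long hwm, long ownWriteId, TreeSet<Long> invalid) {
        this.hwm = hwm;
        this.ownWriteId = ownWriteId;
        this.invalid = invalid;
    }

    // Recalculates the snapshot after this transaction allocates a write ID.
    // IDs between the old hwm and the allocated ID belong to other
    // transactions, so they are marked invalid to keep the view unchanged.
    CachedWriteIdSnapshot afterAllocation(long allocated) {
        if (allocated <= hwm) {
            throw new IllegalArgumentException("allocated writeId must exceed hwm");
        }
        TreeSet<Long> updated = new TreeSet<>(invalid);
        for (long id = hwm + 1; id < allocated; id++) {
            updated.add(id);
        }
        return new CachedWriteIdSnapshot(allocated - 1, allocated, updated);
    }

    // A write ID is visible if it is this transaction's own, or if it is
    // at or below hwm and not in the invalid list.
    boolean isVisible(long writeId) {
        if (writeId == ownWriteId) {
            return true;
        }
        return writeId <= hwm && !invalid.contains(writeId);
    }
}
```

      With the numbers from the description, new CachedWriteIdSnapshot(10L, -1L, 5L, 6L).afterAllocation(13L) yields hwm=12 and an invalid list of 5, 6, 11, 12. Write IDs 11 and 12 stay hidden, while 13 is visible only through the assumed own-writeId check.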

            People

              Assignee: Nikhil Gupta (gupta.nikhil0007)
              Reporter: Sankar Hariappan (sankarh)
              Votes: 0
              Watchers: 4
