  1. Hive
  2. HIVE-21769 Support Partition level filtering for hive replication command
  3. HIVE-21774

Support partition level filtering for events with multiple partitions



    • Sub-task
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 4.0.0
    • None
    • HiveServer2, repl
    • None


      Some events in Hive can span multiple partitions, tables, or even databases. Transaction-related events can span multiple databases. When a transaction performs a write operation, an entry for it is added to the write notification log table. During the dump of a commit transaction event, all entries in the write notification log table for that transaction are read and added to the commit transaction message. If a partition filter is supplied for the dump, only those partitions that are part of the policy should be added to the commit txn message.

      • All events that are not partition level are added to the list of events to be dumped.
      • Pass the filter condition for the policy to the commit transaction message handler (for events which are not partition level).
      • During the dump of a commit transaction event, extract the events added to the write notification log table and compare them with the filter condition.
      • If an event from the write notification log satisfies the filter condition, add it to the commit transaction message.
      • If the filter condition is null, add all the events from the write notification log table to the commit transaction message.
      • For events that do not carry partition-level information, such as open txn and abort txn, dump the events without any filtering. As a result, some events that are unrelated to any partition satisfying the filter may still be replayed.
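
      The commit transaction steps above can be sketched roughly as follows. This is only an illustration of the proposed behaviour, not Hive code: `CommitTxnPartitionFilter`, `WriteLogEntry` and `selectForCommitMessage` are hypothetical names, and the filter condition is modelled as a plain `java.util.function.Predicate`, which is an assumption about how the policy filter might be represented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch; none of these types are part of Hive.
final class CommitTxnPartitionFilter {

    /** Hypothetical stand-in for one row of the write notification log table. */
    static final class WriteLogEntry {
        public final String database;
        public final String table;
        public final String partition; // e.g. "dt=2019-05-01"

        WriteLogEntry(String database, String table, String partition) {
            this.database = database;
            this.table = table;
            this.partition = partition;
        }
    }

    /**
     * Chooses the write notification log entries that go into the
     * commit transaction message.
     *
     * A null filter means no partition filter was supplied for the dump,
     * so every entry is kept. Otherwise only entries that satisfy the
     * filter condition are kept.
     */
    static List<WriteLogEntry> selectForCommitMessage(
            List<WriteLogEntry> entries, Predicate<WriteLogEntry> filter) {
        if (filter == null) {
            return new ArrayList<>(entries);
        }
        List<WriteLogEntry> selected = new ArrayList<>();
        for (WriteLogEntry entry : entries) {
            if (filter.test(entry)) {
                selected.add(entry);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<WriteLogEntry> entries = Arrays.asList(
                new WriteLogEntry("db1", "t1", "dt=2019-05-01"),
                new WriteLogEntry("db1", "t1", "dt=2019-05-02"));

        // Assumed policy: replicate only the 2019-05-01 partition.
        Predicate<WriteLogEntry> policy = e -> "dt=2019-05-01".equals(e.partition);

        System.out.println(selectForCommitMessage(entries, policy).size());
        System.out.println(selectForCommitMessage(entries, null).size());
    }
}
```

      Events without partition-level information (open txn, abort txn and so on) would bypass this method entirely and be dumped as they are.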




            maheshk114 mahesh kumar behera