Hive / HIVE-14841 Replication - Phase 2 / HIVE-19739

Bootstrap REPL LOAD to use checkpoints to validate and skip the loaded data/metadata.

      Description

      Currently, bootstrap REPL LOAD adds checkpoint identifiers to the DB/table/partition object properties once the data/metadata for that object has been loaded successfully.

      If the DB exists and is not empty, we currently throw an exception. This case needs to be supported for the retry scenario after a failure.

      If a bootstrap load is retried using the same dump, then instead of throwing an error we should use the checkpoint identifiers to check whether each table/partition was completely loaded. If it was, skip it; otherwise, drop and re-create it.
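
      The per-object decision on retry could be sketched roughly as follows. This is only an illustration, not the patch itself: the property key `example.repl.ckpt`, the class `ObjectCheckpointSketch`, the `Action` enum and the use of a plain string as the dump identifier are all hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; every name here is hypothetical, not Hive's API.
public class ObjectCheckpointSketch {
    // Hypothetical property key under which the checkpoint identifier is stored.
    static final String CKPT_KEY = "example.repl.ckpt";

    enum Action { SKIP, DROP_AND_RELOAD }

    // Decide what a retried bootstrap load does with one table or partition.
    static Action decide(Map<String, String> objectProps, String dumpId) {
        String ckpt = objectProps.get(CKPT_KEY);
        if (ckpt == null) {
            // No checkpoint: the earlier attempt never finished this object.
            return Action.DROP_AND_RELOAD;
        }
        if (!ckpt.equals(dumpId)) {
            // Checkpoint written by another dump: this is not a retry of the same load.
            throw new IllegalStateException(
                "Object was loaded from a different dump: " + ckpt);
        }
        // Checkpoint matches: the object was completely loaded from this dump.
        return Action.SKIP;
    }

    public static void main(String[] args) {
        Map<String, String> fullyLoaded = new HashMap<>();
        fullyLoaded.put(CKPT_KEY, "dump-1");
        Map<String, String> partiallyLoaded = new HashMap<>();

        System.out.println(decide(fullyLoaded, "dump-1"));
        System.out.println(decide(partiallyLoaded, "dump-1"));
    }
}
```

      Because the checkpoint is written only after an object is successfully loaded, its absence is treated here as "incomplete", which is why the sketch drops and reloads rather than trying to resume a partial load.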

      If the bootstrap load is performed using a different dump, it should throw an exception.

      Allow bootstrap on an empty DB only if the ckpt property is not set. Also, if the bootstrap load has already completed on the target DB, a bootstrap retry should not be allowed at all.
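
      One possible reading of the DB-level rules above, written as a guard that runs before any object is loaded, is sketched below. It is an assumption-laden illustration: `BootstrapGuardSketch`, the `Mode` enum, the `bootstrapCompleted` flag and the string dump identifier are hypothetical, and how Hive actually records that a bootstrap has completed is not stated in this description.

```java
// Illustrative sketch only; every name here is hypothetical, not Hive's API.
public class BootstrapGuardSketch {
    enum Mode { FRESH_LOAD, RETRY }

    // dbCkpt is the checkpoint identifier found on the target DB, or null if none is set.
    static Mode validate(boolean dbEmpty, String dbCkpt,
                         boolean bootstrapCompleted, String dumpId) {
        if (bootstrapCompleted) {
            // A finished bootstrap must never be retried.
            throw new IllegalStateException(
                "Bootstrap load already completed on the target DB");
        }
        if (dbCkpt == null) {
            if (dbEmpty) {
                // Empty DB with no checkpoint: a normal first bootstrap.
                return Mode.FRESH_LOAD;
            }
            // Non-empty DB that was not populated by a checkpointed load.
            throw new IllegalStateException(
                "Target DB is not empty and has no checkpoint");
        }
        if (!dbCkpt.equals(dumpId)) {
            // The DB was (partly) loaded from some other dump.
            throw new IllegalStateException(
                "Target DB was loaded from a different dump: " + dbCkpt);
        }
        // Same dump: continue as a retry and let per-object checkpoints decide what to skip.
        return Mode.RETRY;
    }

    public static void main(String[] args) {
        System.out.println(validate(true, null, false, "dump-1"));
        System.out.println(validate(false, "dump-1", false, "dump-1"));
    }
}
```

      Checking the completed flag first keeps the "no retry after completion" rule independent of which dump is supplied.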

        Attachments

        1. HIVE-19739.01.patch
          59 kB
          Sankar Hariappan
        2. HIVE-19739.02.patch
          83 kB
          Sankar Hariappan
        3. HIVE-19739.03.patch
          89 kB
          Sankar Hariappan
        4. HIVE-19739.04.patch
          90 kB
          Sankar Hariappan
        5. HIVE-19739.01-branch-3.patch
          90 kB
          Sankar Hariappan

              People

              • Assignee:
                Sankar Hariappan (sankarh)
                Reporter:
                Sankar Hariappan (sankarh)
              • Votes:
                0
                Watchers:
                5
