Hive / HIVE-14841 Replication - Phase 2 / HIVE-16813

Incremental REPL LOAD should load events in the same sequence in which they were dumped.

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 3.0.0
    • Component/s: Hive, repl

    Description

      Currently, incremental REPL DUMP uses $dumpdir/<eventID> to dump the metadata and data files corresponding to each event. Events are dumped in the same sequence in which they were generated.

      REPL LOAD, however, lists the directories inside $dumpdir using listStatus and sorts them using the compareTo method of the FileStatus class, which orders entries alphabetically without first comparing their lengths.
      As a result, event-100 is processed before event-99, which leaves the replica database out of sync with the source.

      REPL LOAD needs a customized comparison, instead of the default compareTo, to sort the FileStatus entries.
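
      The ordering problem and one possible fix can be sketched in plain Java. This is a minimal illustration, not the code from the attached patches: the class name EventDirOrder, the comparator BY_EVENT_ID and the helper sorted are hypothetical, and the sketch works on bare directory-name strings rather than FileStatus objects so that it runs without Hadoop. It assumes each directory name is a non-negative decimal event ID with no leading zeros, in which case comparing lengths first and then comparing alphabetically gives numeric order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class EventDirOrder {
    // Hypothetical comparator: shorter names sort first; names of equal
    // length fall back to ordinary alphabetical comparison.
    // Assumes names are decimal event IDs without leading zeros.
    static final Comparator<String> BY_EVENT_ID = (a, b) -> {
        if (a.length() != b.length()) {
            return Integer.compare(a.length(), b.length());
        }
        return a.compareTo(b);
    };

    // Returns a sorted copy, leaving the input list untouched.
    static List<String> sorted(List<String> dirNames) {
        List<String> copy = new ArrayList<>(dirNames);
        copy.sort(BY_EVENT_ID);
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("100", "99", "101", "9");

        // Plain alphabetical order, which is the behaviour described above.
        List<String> alphabetical = new ArrayList<>(names);
        Collections.sort(alphabetical);
        System.out.println(alphabetical); // prints [100, 101, 9, 99]

        // Length-aware order.
        System.out.println(sorted(names)); // prints [9, 99, 100, 101]
    }
}
```

      With FileStatus entries, the same comparison would presumably be applied to the last path component of each entry, for example the value returned by getPath().getName().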

      Attachments

        1. HIVE-16813.02.patch
          17 kB
          Sankar Hariappan
        2. HIVE-16813.01.patch
          13 kB
          Sankar Hariappan

          People

            Assignee: Sankar Hariappan (sankarh)
            Reporter: Sankar Hariappan (sankarh)
            Votes: 0
            Watchers: 5
