  1. Mesos
  2. MESOS-1165

Retry required when recovering an empty log

Details

    • Bug
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • None
    • 0.19.0
    • None
    • None

    Description

      Reported by benjaminhindman. It's fairly non-intuitive that a 'fill' retry is required when recovering an empty log. Moreover, since the retry is done via a 'delay', you can't pause the clock before calling Log::Writer::start! The following tests show the multiple calls, and at one point I added comments to explain the very esoteric reasoning here. Here is the sequence of events:

      First, a replica is recovered with nothing, but position 0 is always a hole:


      Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned


      At this point the replica assumes it was promised to a coordinator with proposal 0 (that's the default metadata). Then an implicit promise request is made with proposal 1.


      Replica received implicit promise request with proposal 1


      Then the coordinator (via FillProcess) tries to fill the hole (position 0) explicitly:


      Coordinator attemping to fill missing position


      And the replica receives the request:


      Replica received explicit promise request for position 0 with proposal 1


      But the fill must be retried: position 0 is already implicitly promised to proposer 1 (the same coordinator!), and the replica won't allow an explicit promise with that same proposal number (because it might not be safe). So the FillProcess now tries again with proposal number 2 (after the delay). While correct, this seems unfortunate (and not intuitive).

          People

            jieyu Jie Yu
            jieyu Jie Yu
