  1. Ignite
  2. IGNITE-19271

Persist revision-safeTime mapping in meta-storage


Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 3.0
    • None

    Description

      IEP-98 states:

      When creating a message M telling the cluster about a schema update activation moment, choose the message timestamp Tm (moving safeTime forward) equal to Now, but assign Tu (activation moment) contained in that M to be Tm+DD 

      This is hard to achieve.

      Problem

      We need Tu==Tm+DD. Right now, with what we have in IGNITE-19028, it's not straightforward. This is because we have too many actors:

      • There's a client that chooses Tu, because it's the only actor that can affect message content.
      • There's a meta-storage lease-holder, or leader, that chooses Tm.
      • There's everybody else, who expects a correspondence between Tu and Tm.

      The first two actors matter because they have independent clocks, yet must agree on the timing of the same event. This is impossible with the described protocol.

      Discussion

      Let's consider these two solutions:

      1. Client generates Tm.
      2. Meta-storage generates Tu.

      Option 1 is out of the question: at any given moment, only a single node may be responsible for the linear order of time in messages.

      What about option 2? Since meta-storage doesn't know anything about command semantics, it can't generate any command data itself. So this solution doesn't work either.

      Solution

      A combined solution could be the following:

      • Client sends DD as part of the command (this is not a constant; users can configure it if they really want to)
      • Meta-storage generates Tm
      • Every node, upon receiving the update, calculates Tu

      This would work if nodes were never restarted. One problem remains to be solved: recovering the values of Tm from the (old) data upon node restart.

      This can be achieved by persisting safeTime along with revision as a part of metadata, that can be retrieved back through the meta-storage service API.

      In other words:

      1. Client sends:

      schema.latest   = 5
      schema.5.data   = ...
      schema.5.dd     = 30s

      2. Lease-holder adds metadata to the command:

      safeTime = 10:10:00
      

      3. Meta-storage listener writes the data:

      revision = 33
          schema.latest = 5
          schema.5.data = ...
          schema.5.dd   = 30s
      
      revision.33.safeTime = 10:10:00
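
      The write path in steps 1-3 can be sketched as follows. This is a minimal in-memory illustration only: the class MetaStorageSketch and its methods (apply, revisionOf, safeTimeOf) are hypothetical names invented here, not the Ignite meta-storage API, and the real listener would write to persistent storage.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for the meta-storage; not the Ignite API. */
class MetaStorageSketch {
    /** Latest value for every key. */
    private final Map<String, String> values = new HashMap<>();

    /** Revision of the last update that touched each key. */
    private final Map<String, Long> keyRevisions = new HashMap<>();

    /** The mapping this issue proposes to persist: revision -> safeTime (Tm). */
    private final TreeMap<Long, Instant> revisionSafeTimes = new TreeMap<>();

    /** Last applied revision. */
    private long revision;

    MetaStorageSketch(long lastAppliedRevision) {
        this.revision = lastAppliedRevision;
    }

    /**
     * Applies one command. The safeTime is assumed to have been attached
     * by the lease-holder, not by the client that produced the updates.
     */
    long apply(Map<String, String> updates, Instant safeTime) {
        long rev = ++revision;

        for (Map.Entry<String, String> e : updates.entrySet()) {
            values.put(e.getKey(), e.getValue());
            keyRevisions.put(e.getKey(), rev);
        }

        // Stored together with the data, so it can be recovered after restart.
        revisionSafeTimes.put(rev, safeTime);

        return rev;
    }

    String get(String key) {
        return values.get(key);
    }

    long revisionOf(String key) {
        return keyRevisions.get(key);
    }

    Instant safeTimeOf(long rev) {
        return revisionSafeTimes.get(rev);
    }
}
```

      The point of the sketch is that the client never supplies a timestamp: it only supplies keys and values, and the safeTime is recorded per revision rather than per key.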

       

      How to read Tu:

      • read "schema.5.dd", it's 30s;
      • read its revision, it's 33;
      • read the timestamp of revision 33 via a specialized API, it's 10:10:00;
      • add the two values together: Tu = 10:10:00 + 30s = 10:10:30.
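
      The four read steps can be sketched like this. The three maps stand in for whatever lookups the meta-storage service would expose; the class TuReader, the method readTu, and the parsing of "30s" through Duration.parse are all assumptions made for illustration. The calendar date used in the usage note is arbitrary, since the example above only gives a time of day.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;

/** Hypothetical helper that derives Tu = Tm + DD from stored data. */
class TuReader {
    static Instant readTu(
            int schemaVersion,
            Map<String, String> values,             // key -> value
            Map<String, Long> keyRevisions,         // key -> revision that wrote it
            Map<Long, Instant> revisionSafeTimes    // revision -> safeTime (Tm)
    ) {
        String ddKey = "schema." + schemaVersion + ".dd";

        // Step 1: read DD. Assumes short values such as "30s" or "5m",
        // which become ISO-8601 durations once prefixed with "PT".
        Duration dd = Duration.parse("PT" + values.get(ddKey));

        // Step 2: read the revision that wrote the DD key.
        long rev = keyRevisions.get(ddKey);

        // Step 3: read the safeTime recorded for that revision.
        Instant tm = revisionSafeTimes.get(rev);

        // Step 4: add the two values together.
        return tm.plus(dd);
    }
}
```

      With "schema.5.dd" = 30s written at revision 33, and revision 33 mapped to 10:10:00 on some day, readTu returns 10:10:30 on that same day. Every node computes the same Tu because both inputs come from replicated data, not from a local clock.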

      Implications and restrictions

      There's a cleanup process in the meta-storage. It will eventually remove any "revision.x.safeTime" value once the corresponding revision becomes obsolete.

      However, the timestamps of revisions that are used by schemas must be preserved. This can be achieved if components can reserve a revision, so that the meta-storage can't compact it until the reservation is revoked.
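
      One possible shape for such a reservation mechanism is sketched below. The issue does not specify an API, so the class RevisionReservations and its methods (record, reserve, release, compact) are hypothetical. The sketch also assumes that compaction skips reserved revisions while still removing unreserved ones; blocking compaction entirely at the lowest reserved revision would be another reading of the requirement.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical revision -> safeTime store with reservations; not the Ignite API. */
class RevisionReservations {
    /** Persisted mapping: revision -> safeTime. */
    private final TreeMap<Long, Instant> safeTimes = new TreeMap<>();

    /** Reservation counters, so several components can reserve one revision. */
    private final Map<Long, Integer> reservations = new HashMap<>();

    void record(long rev, Instant safeTime) {
        safeTimes.put(rev, safeTime);
    }

    void reserve(long rev) {
        reservations.merge(rev, 1, Integer::sum);
    }

    void release(long rev) {
        // Returning null from the remapping function removes the entry.
        reservations.computeIfPresent(rev, (r, n) -> n > 1 ? n - 1 : null);
    }

    /** Drops safeTime entries for revisions up to and including upTo, unless reserved. */
    void compact(long upTo) {
        safeTimes.headMap(upTo, true)
                .keySet()
                .removeIf(rev -> !reservations.containsKey(rev));
    }

    boolean hasSafeTime(long rev) {
        return safeTimes.containsKey(rev);
    }
}
```

      In this sketch a schema component would call reserve(33) while schema version 5 is still in use, and release(33) once Tu for that version no longer needs to be recomputed.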

            People

              sdanilov Semyon Danilov
              ibessonov Ivan Bessonov