Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.7.0
    • Component/s: replication
    • Labels:
      None

      Description

      I wrote a test that more rigorously verifies that two tables are equal, and I was surprised to find that the tables were in fact not equal.

      Digging into it some more, I eventually found that the keys and values were identical, save for the timestamp. Although the Mutations read from the local WAL carry timestamps set by the server, those timestamps were lost.

      Specifically, the "real" timestamp is stored on the ServerMutation, not each ColumnUpdate. On the peer, when the BatchWriter makes a shallow copy of the (Server)Mutation to apply on the target table for replication, we lose that ServerMutation and get a "regular" Mutation which has updates that don't have any timestamp set. If the BatchWriter didn't make the shallow copy, this should work.
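The loss described above can be illustrated with a small model. This is only a sketch: `PlainMutation` and `ServerSideMutation` are hypothetical stand-ins, not the Accumulo `Mutation` and `ServerMutation` classes, and the copy constructor only approximates the shallow copy the BatchWriter makes.

```java
import java.util.OptionalLong;

/** Hypothetical stand-ins for illustration; these are not the Accumulo classes. */
final class TimestampLossSketch {

  /** Models a regular mutation whose timestamp may be unset. */
  static class PlainMutation {
    final String row;
    final OptionalLong clientTimestamp;

    PlainMutation(String row, OptionalLong clientTimestamp) {
      this.row = row;
      this.clientTimestamp = clientTimestamp;
    }

    /** Shallow copy: only the fields known to the base class are carried over. */
    PlainMutation(PlainMutation other) {
      this(other.row, other.clientTimestamp);
    }

    OptionalLong timestamp() {
      return clientTimestamp;
    }
  }

  /** Models a server-side wrapper that supplies a timestamp when none was set. */
  static final class ServerSideMutation extends PlainMutation {
    final long systemTimestamp;

    ServerSideMutation(String row, OptionalLong clientTimestamp, long systemTimestamp) {
      super(row, clientTimestamp);
      this.systemTimestamp = systemTimestamp;
    }

    @Override
    OptionalLong timestamp() {
      return clientTimestamp.isPresent() ? clientTimestamp : OptionalLong.of(systemTimestamp);
    }
  }
}
```

In this model, copying a `ServerSideMutation` through the `PlainMutation` copy constructor yields an object whose `timestamp()` is empty, because the wrapper that held the server-assigned value is gone.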

          Activity

          Josh Elser added a comment -

          The "proper" fix would be to push the systemTimestamp down into each ColumnUpdate but I'm worried about the unintended consequences that might arise from doing that in the "regular" pipeline. I can account for this within the replication code.

          Josh Elser added a comment -

          Despite what I tried, I can't get the timestamp to be propagated through to the server without completely rewriting all of the Mutations that are being replicated, which is really terrible. Some things that kept me from getting there:

          • ColumnUpdates on Mutation are immutable and not stored unserialized in a Mutation
          • Tried to make BatchWriter aware of ServerMutations and copy them directly, but they would get thrown away in the conversion to TMutation done inside the BatchWriter

          If anyone has any smart ideas on how things could be reworked (assuming Keith Turner would have the best idea since I think you did logical time), I would be more than accepting.

          ASF subversion and git services added a comment -

          Commit b062a0bd3ed388f89bc04dfa2903bf3cc951976c in accumulo's branch refs/heads/master from Josh Elser
          [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=b062a0b ]

          ACCUMULO-2925 Create regular Mutations from ServerMutations when applying replication data on a peer

          Mutations do not store unserialized ColumnUpdates, but only generate them
          on demand via the getter. This is intended to create an efficient implementation
          (both performance and size) while preserving immutability.

          Server-assigned timestamps work around this immutability by wrapping normal
          Mutations in a ServerMutation and ColumnUpdates with ServerColumnUpdates. By doing
          this, ServerMutations can "fake" the timestamp on ColumnUpdates that otherwise
          do not have a timestamp set.

          In the context of replication, this is still a problem as all Mutations that are
          sent to a peer are ServerMutations (as we read them from a WAL). These Mutations are
          deserialized and passed into a BatchWriter to apply to the local instance; however, the
          BatchWriter is ignorant of ServerMutations and the special timestamp handling.

          When the BatchWriter makes a "copy" of the Mutation (see ACCUMULO-2915), despite this
          being a shallow copy, the server-assigned timestamp is lost by creating a regular
          Mutation from what was a ServerMutation. Even if this were possible, the TMutation
          class, which the BatchWriter eventually uses to send the Mutations to a TabletServer,
          is also ignorant of the ServerMutation timestamp without modification of the serialization
          and TMutation class.

          As such, the only option left is, when encountering ServerMutations in the BatchWriterReplicationReplayer
          code, to recreate new Mutations, applying the possibly present server timestamp to
          each new Mutation we create to ensure that the timestamp is correctly propagated to this peer.
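The approach in this commit message can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code from the commit: `Update`, `SimpleMutation`, `SimpleServerMutation`, and `rebuildForReplay` are hypothetical names, values are plain strings, and deletes and column visibilities are left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-ins for illustration; these are not the Accumulo classes. */
final class ReplaySketch {

  /** One column update. hasTimestamp is false when no timestamp was set on it. */
  static final class Update {
    final String family;
    final String qualifier;
    final boolean hasTimestamp;
    final long timestamp;
    final String value;

    Update(String family, String qualifier, boolean hasTimestamp, long timestamp, String value) {
      this.family = family;
      this.qualifier = qualifier;
      this.hasTimestamp = hasTimestamp;
      this.timestamp = timestamp;
      this.value = value;
    }
  }

  /** Models a regular mutation: a row plus its column updates. */
  static class SimpleMutation {
    final String row;
    final List<Update> updates = new ArrayList<>();

    SimpleMutation(String row) {
      this.row = row;
    }

    /** Put without a timestamp; the server is expected to assign one. */
    void put(String family, String qualifier, String value) {
      updates.add(new Update(family, qualifier, false, 0L, value));
    }

    /** Put with an explicit timestamp. */
    void put(String family, String qualifier, long timestamp, String value) {
      updates.add(new Update(family, qualifier, true, timestamp, value));
    }
  }

  /** Models a server-side mutation: the assigned timestamp lives on the wrapper. */
  static final class SimpleServerMutation extends SimpleMutation {
    final long systemTimestamp;

    SimpleServerMutation(String row, long systemTimestamp) {
      super(row);
      this.systemTimestamp = systemTimestamp;
    }
  }

  /** Rebuilds a server-side mutation so that every update carries an explicit timestamp. */
  static SimpleMutation rebuildForReplay(SimpleMutation orig) {
    if (!(orig instanceof SimpleServerMutation)) {
      return orig; // no server-assigned timestamp to preserve
    }
    long systemTimestamp = ((SimpleServerMutation) orig).systemTimestamp;
    SimpleMutation copy = new SimpleMutation(orig.row);
    for (Update u : orig.updates) {
      // Keep a timestamp that was set explicitly; otherwise apply the server-assigned one.
      long ts = u.hasTimestamp ? u.timestamp : systemTimestamp;
      copy.put(u.family, u.qualifier, ts, u.value);
    }
    return copy;
  }
}
```

The rebuilt object is a plain mutation, so a writer that knows nothing about the server-side wrapper still sends the intended timestamp with each update.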

          ASF subversion and git services added a comment -

          Commit 03c93c9dd74abadad027cec6a934c92fd1d58f8c in accumulo's branch refs/heads/master from Josh Elser
          [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=03c93c9 ]

          ACCUMULO-2925 Need to preserve replicationSource on the Mutation

          The replicationSource on the Mutation is the information which prevents cycles
          in the replication graph from infinitely replicating information. Each replicationSource
          on a Mutation is the `replication.name` for a system from which that Mutation came.

          We can later use this set to determine if we need to replicate this Mutation
          to a given peer by observing if the `replication.name` of our peer already
          exists in the replicationSources.
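The cycle check described in this commit message amounts to a set-membership test. A minimal sketch, assuming the sources are held as a set of `replication.name` strings; `shouldReplicateTo` and `withSource` are hypothetical helper names, not Accumulo API:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helpers for illustration; these are not part of the Accumulo API. */
final class ReplicationSourcesSketch {

  /** A mutation is sent to a peer only if that peer's name is not already among its sources. */
  static boolean shouldReplicateTo(Set<String> replicationSources, String peerName) {
    return !replicationSources.contains(peerName);
  }

  /** Returns the sources a mutation carries once the system named localName is recorded. */
  static Set<String> withSource(Set<String> replicationSources, String localName) {
    Set<String> updated = new HashSet<>(replicationSources);
    updated.add(localName);
    return Collections.unmodifiableSet(updated);
  }
}
```

With systems named A and B replicating to each other, a mutation that has picked up the sources {A, B} is not sent back to A, which is what stops the cycle. Dropping the sources during replay would defeat this check.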

          ASF subversion and git services added a comment -

          Commit cc7f91b0b23db6addfb05e7d443235fe4e46124e in accumulo's branch refs/heads/master from Josh Elser
          [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=cc7f91b ]

          ACCUMULO-2925 Remove inadvertently deleted log message

          ASF subversion and git services added a comment -

          Commit 06760572e9325ba1f622a998085853a0098de783 in accumulo's branch refs/heads/master from Josh Elser
          [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=0676057 ]

          ACCUMULO-2925 Add test to ensure that replicationSource(s) are preserved before replay

          ASF subversion and git services added a comment -

          Commit 4d7e90aeef3a6de6a36a30a188d5c1bc564ade3a in accumulo's branch refs/heads/master from Josh Elser
          [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=4d7e90a ]

          ACCUMULO-2925 Add warning about server-assigned timestamps with replication

          Leave a note about differing updates to equal keys that are
          assigned the same timestamp by the server.


            People

            • Assignee:
              Josh Elser
              Reporter:
              Josh Elser
            • Votes:
              0
              Watchers:
              1
