Accumulo
  1. Accumulo
  2. ACCUMULO-2915

Avoid copying all Mutations when using a TabletServerBatchWriter

    Details

    • Type: Improvement Improvement
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 1.5.0, 1.5.1, 1.6.0, 1.6.1, 1.7.0
    • Fix Version/s: 1.8.0
    • Component/s: client
    • Labels:
      None

      Description

      Currently in the TabletServerBatchWriter, the following behavior is exhibited:

          // create a copy of mutation so that after this method returns the user
          // is free to reuse the mutation object, like calling readFields... this
          // is important for the case where a mutation is passed from map to reduce
          // to batch writer... the map reduce code will keep passing the same mutation
          // object into the reduce method
          m = new Mutation(m);
          
          totalMemUsed += m.estimatedMemoryUsed();
          mutations.addMutation(table, m);
          totalAdded++;
      

      This means all data is copied twice when writing. The logic for doing this is a bit dubious, since not all clients are going to be subject to MapReduce's use of references.

      It'd be good if we provided users with a way of signaling that there's no need to copy the mutation payload. Josh Elser suggested creating something akin to an ImmutableMutation, which help avoid some of the fears the batchwriter attempts to defend against.

        Issue Links

          Activity

          Hide
          Josh Elser added a comment -

          This is big (in a good way), but it also goes against the current implementation. I don't think I've ever stumbled across use of a Mutation that wasn't "create", "add updates", "submit" and then "throwaway", despite that being the default case that the BatchWriter expects. Despite it being awkward to me, we're going to have to keep that default action to ensure we don't break anyone. They'll have to opt-in to a zero-copy variant.

          I like the ImmutableMutation idea, but as long as it serializes in the same way that Mutation currently does, it shouldn't result in any changes server-side. I was talking with William Slacum about this, making ImmutableMutation Writable does introduce interface confusion because it shouldn't be used as a normal Writable object is.

          This makes me wonder if we should consider using Kryo for serialization of "mutations" instead of the Writable interface. Last time I did some benchmarks, I think Kryo was faster that Writable too which would have another gain. I don't want to dirty up this ticket with scope-creep, but I wanted to mention it publicly before I forgot again.

          Show
          Josh Elser added a comment - This is big (in a good way), but it also goes against the current implementation. I don't think I've ever stumbled across use of a Mutation that wasn't "create", "add updates", "submit" and then "throwaway", despite that being the default case that the BatchWriter expects. Despite it being awkward to me, we're going to have to keep that default action to ensure we don't break anyone. They'll have to opt-in to a zero-copy variant. I like the ImmutableMutation idea, but as long as it serializes in the same way that Mutation currently does, it shouldn't result in any changes server-side. I was talking with William Slacum about this, making ImmutableMutation Writable does introduce interface confusion because it shouldn't be used as a normal Writable object is. This makes me wonder if we should consider using Kryo for serialization of "mutations" instead of the Writable interface. Last time I did some benchmarks, I think Kryo was faster that Writable too which would have another gain. I don't want to dirty up this ticket with scope-creep, but I wanted to mention it publicly before I forgot again.
          Hide
          Keith Turner added a comment -

          Note that the data is not copied. The new Mutation object just points to the private data of the other mutation. This allows readfields to be called on the passed in mutation w/o causing problems.

            public Mutation(Mutation m) {
              m.serialize();
              this.row = m.row;
              this.data = m.data;
              this.entries = m.entries;
              this.values = m.values;
            }
          
          Show
          Keith Turner added a comment - Note that the data is not copied. The new Mutation object just points to the private data of the other mutation. This allows readfields to be called on the passed in mutation w/o causing problems. public Mutation(Mutation m) { m.serialize(); this .row = m.row; this .data = m.data; this .entries = m.entries; this .values = m.values; }
          Hide
          Sean Busbey added a comment -

          While I also don't want to clutter this ticket, I would like to point out that we currently have lots of serialization libraries in use, almost all of which could be replaced with Avro. Barring a major performance differentiator (on the order of 2x+), such a change would greatly simplify our long term maintenance.

          So if we do make any additional changes in serialization points, please make them pluggable so that it's easier to do comparisons and consolidation.

          Show
          Sean Busbey added a comment - While I also don't want to clutter this ticket, I would like to point out that we currently have lots of serialization libraries in use, almost all of which could be replaced with Avro. Barring a major performance differentiator (on the order of 2x+), such a change would greatly simplify our long term maintenance. So if we do make any additional changes in serialization points, please make them pluggable so that it's easier to do comparisons and consolidation.
          Hide
          William Slacum added a comment -

          Keith Turner serves me right for not digging deeper in there.

          Show
          William Slacum added a comment - Keith Turner serves me right for not digging deeper in there.
          Hide
          Josh Elser added a comment -

          Keith Turner serves me right for not digging deeper in there.

          Not the first time I've been Turner'ed either. So really this just boils down to creating an extra instance of Mutation which still could be improved, but likely at much less of a gain.

          Show
          Josh Elser added a comment - Keith Turner serves me right for not digging deeper in there. Not the first time I've been Turner'ed either. So really this just boils down to creating an extra instance of Mutation which still could be improved, but likely at much less of a gain.
          Hide
          William Slacum added a comment -

          There may still be wiggle room, if we can somehow avoid the copy in Mutation.ByteBuffer#toArray for the internal buffer. An ImmutableMutation could simply keep the original byte array, an offset, and a length, and upon calls to #write, just write out what it already had in memory instead of creating a perfectly sized byte array.

          Show
          William Slacum added a comment - There may still be wiggle room, if we can somehow avoid the copy in Mutation.ByteBuffer#toArray for the internal buffer. An ImmutableMutation could simply keep the original byte array, an offset, and a length, and upon calls to #write , just write out what it already had in memory instead of creating a perfectly sized byte array.
          Hide
          Mike Drob added a comment -

          Anecdotal evidence against sharing state - java.util.String recently switched to creating a copy for #substring instead of storing a pointer, offset, and length. The rationale that I read was that the average size was small enough that the copy was quick, and the cost of extra references and indirection was greater than the benefit of sharing data.

          Show
          Mike Drob added a comment - Anecdotal evidence against sharing state - java.util.String recently switched to creating a copy for #substring instead of storing a pointer, offset, and length. The rationale that I read was that the average size was small enough that the copy was quick, and the cost of extra references and indirection was greater than the benefit of sharing data.
          Hide
          William Slacum added a comment -

          Yeah, I was aware of that. I'm not trying to share state. The current Mutation implementation does a Mutation.ByteBuffer#toArray before creating the new Mutation which holds on to the old references and then sets the Buffer to null. Looking at the code, this actually might be problematic because:

            public Mutation(Mutation m) {
              m.serialize();
              this.row = m.row;
              this.data = m.data;
              this.entries = m.entries;
              this.values = m.values;
            }
          

          ends up calling:

            private void serialize() {
             if (buffer != null) {
               data = buffer.toArray();
               buffer = null;
             }
           }
          

          So now I'm my user, who should have a Mutation object I can still use. Let's try to call Mutation#put on that bad boy...

            private void put(byte[] cf, int cfLength, byte[] cq, int cqLength, byte[] cv, boolean hasts, long ts, boolean deleted, byte[] val, int valLength) {
              if (buffer == null) {
                throw new IllegalStateException("Can not add to mutation after serializing it");
              }
            ....
          }
          

          So I think Mutation may have an oddball workflow. A quick test confirms this:

              public static void main(String[] args) {
                  Mutation m = new Mutation("row");
                  m.put("cf", "cq1", "foo");
                  // let's simulate what a batch scanner does
                  new Mutation(m);
                  // now let's try to reuse the original mutation
                  try {
                      m.put("cf", "cq2", "bar");
                  } catch (IllegalStateException e) {
                      System.err.println("VICTORY");
                  }
              }
          

          Notice how glorious the victory was, especially if that was run in Eclipse and the default stderr output is red.

          Show
          William Slacum added a comment - Yeah, I was aware of that. I'm not trying to share state. The current Mutation implementation does a Mutation.ByteBuffer#toArray before creating the new Mutation which holds on to the old references and then sets the Buffer to null. Looking at the code, this actually might be problematic because: public Mutation(Mutation m) { m.serialize(); this .row = m.row; this .data = m.data; this .entries = m.entries; this .values = m.values; } ends up calling: private void serialize() { if (buffer != null ) { data = buffer.toArray(); buffer = null ; } } So now I'm my user, who should have a Mutation object I can still use. Let's try to call Mutation#put on that bad boy... private void put( byte [] cf, int cfLength, byte [] cq, int cqLength, byte [] cv, boolean hasts, long ts, boolean deleted, byte [] val, int valLength) { if (buffer == null ) { throw new IllegalStateException( "Can not add to mutation after serializing it" ); } .... } So I think Mutation may have an oddball workflow. A quick test confirms this: public static void main( String [] args) { Mutation m = new Mutation( "row" ); m.put( "cf" , "cq1" , "foo" ); // let's simulate what a batch scanner does new Mutation(m); // now let's try to reuse the original mutation try { m.put( "cf" , "cq2" , "bar" ); } catch (IllegalStateException e) { System .err.println( "VICTORY" ); } } Notice how glorious the victory was, especially if that was run in Eclipse and the default stderr output is red.
          Hide
          ASF subversion and git services added a comment -

          Commit b062a0bd3ed388f89bc04dfa2903bf3cc951976c in accumulo's branch refs/heads/master from Josh Elser
          [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=b062a0b ]

          ACCUMULO-2925 Create regular Mutations from ServerMutations when applying replication data on a peer

          Mutations do not store unserialized ColumnUpdates, but only generate them
          on demand via the getter. This is intended to create an efficient implementation
          (both performance and size) while preseving immutability.

          Server-assigned timestamps work around this immutability by wrapping normal
          Mutations in a ServerMutation and ColumnUpdates with ServerColumnUpdates. By doing
          this, ServerMutations can "fake" the timestamp on ColumnUpdates that otherwise
          do not have a timestamp set.

          In the context of replication, this is still a problem as all Mutations that are
          sent to a peer are ServerMutations (as we read them from a WAL). These Mutations are
          deserialized and passed into a BatchWriter to apply to the local instance; however, the
          BatchWriter is ignorant of ServerMutations and the special timestamp handling.

          When the BatchWriter makes a "copy" of the Mutation (see ACCUMULO-2915), despite this
          being a shallow copy, the server-assigned timestamp is lost by creating a regular
          Mutation from what was a ServerMutation. Even if this were possible, the TMutation
          class, which the BatchWriter eventually uses to send to the Mutations to a TabletServer,
          is also ignorant of the ServerMutation timestamp without modification of the serialization
          and TMutation class.

          As such, the only option left is to, when encountering ServerMutations in the BatchWriterReplicationReplayer
          code, we must recreate new Mutations, applying the possibly present server-timestamp to
          each new Mutation we create to ensure that the timestamp is correctly propagated to this peer.

          Show
          ASF subversion and git services added a comment - Commit b062a0bd3ed388f89bc04dfa2903bf3cc951976c in accumulo's branch refs/heads/master from Josh Elser [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=b062a0b ] ACCUMULO-2925 Create regular Mutations from ServerMutations when applying replication data on a peer Mutations do not store unserialized ColumnUpdates, but only generate them on demand via the getter. This is intended to create an efficient implementation (both performance and size) while preseving immutability. Server-assigned timestamps work around this immutability by wrapping normal Mutations in a ServerMutation and ColumnUpdates with ServerColumnUpdates. By doing this, ServerMutations can "fake" the timestamp on ColumnUpdates that otherwise do not have a timestamp set. In the context of replication, this is still a problem as all Mutations that are sent to a peer are ServerMutations (as we read them from a WAL). These Mutations are deserialized and passed into a BatchWriter to apply to the local instance; however, the BatchWriter is ignorant of ServerMutations and the special timestamp handling. When the BatchWriter makes a "copy" of the Mutation (see ACCUMULO-2915 ), despite this being a shallow copy, the server-assigned timestamp is lost by creating a regular Mutation from what was a ServerMutation. Even if this were possible, the TMutation class, which the BatchWriter eventually uses to send to the Mutations to a TabletServer, is also ignorant of the ServerMutation timestamp without modification of the serialization and TMutation class. As such, the only option left is to, when encountering ServerMutations in the BatchWriterReplicationReplayer code, we must recreate new Mutations, applying the possibly present server-timestamp to each new Mutation we create to ensure that the timestamp is correctly propagated to this peer.

            People

            • Assignee:
              William Slacum
              Reporter:
              William Slacum
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:

                Development