AVRO-327: Performance improvements to BinaryDecoder.readLong()

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: java
    • Labels: None

      Description

      AVRO-315 proposed performance improvements to readLong(), readFloat() and readDouble(). The readLong() change did not improve performance in all cases. Scott proposed a better method (but one that requires a change in semantics and API). We'll carry on the discussion of that proposal here. AVRO-315 will be committed with the changes for readFloat() and readDouble().
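      For context on what readLong() does: Avro writes int and long values with zig-zag coding followed by a variable-length, little-endian base-128 byte sequence, so readLong() is a tight per-byte loop. Below is a minimal sketch (not taken from any of the attached patches; the class name, fields, and omitted refill/bounds handling are assumptions) of the kind of buffer-backed decode the discussion is about, i.e. reading out of an in-memory byte[] instead of calling InputStream.read() once per byte:

      // Illustrative sketch only; buffer refill and bounds handling omitted.
      class BufferedVarintSketch {
          private final byte[] buf;   // bytes already pulled from the underlying stream
          private int pos;            // next unread position in buf

          BufferedVarintSketch(byte[] buf) { this.buf = buf; }

          long readLong() {
              long n = 0;
              int shift = 0;
              int b;
              do {                                 // little-endian base-128 groups
                  b = buf[pos++] & 0xff;
                  n |= (long) (b & 0x7f) << shift;
                  shift += 7;
              } while ((b & 0x80) != 0);
              return (n >>> 1) ^ -(n & 1);         // undo zig-zag encoding
          }
      }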

      Attachments

      1. AVRO-327-2.patch (21 kB) - Thiruvalluvan M. G.
      2. AVRO-327.patch (15 kB) - Doug Cutting


          Activity

          Scott Carey added a comment -

          This work was subsumed by AVRO-392.

          Doug Cutting added a comment -

          > Should I resolve this as a duplicate?

          Yes, I think so.

          Scott Carey added a comment -

          > I worked on Doug's patch and fixed most of the problems. But TestCompare and TestDataFile still fail. TestCompare fails because, I think, the BinaryData class assumes that the decoder will not read past the required bytes. TestDataFile fails because we are not able to sync to the next sync point, since that has already been read into the BufferedBinaryDecoder's buffer.

          As part of AVRO-392 I have modified the tests and BinaryData to deal with the buffering. The improvements in this ticket are now all included as a (large) subset of AVRO-392. Should I resolve this as a duplicate?

          Thiruvalluvan M. G. added a comment -

          I worked on Doug's patch and fixed most of the problems. But TestCompare and TestDataFile still fail. TestCompare fails because, I think, the BinaryData class assumes that the decoder will not read past the required bytes. TestDataFile fails because we are not able to sync to the next sync point, since that has already been read into the BufferedBinaryDecoder's buffer.

          I think we'll also lose out on the ByteBufferInputStream optimization. In our eagerness to read more data than absolutely necessary, we'll end up with part of the data in the BufferedDecoder's buffer and the rest in the ByteBufferInputStream, killing the latter's optimization.

          Scott Carey added a comment -

          self quote:

          This performance difference isn't small, its a factor of 2.5 on the previous test for things that aren't fully optimized yet - including the readDouble() that read 8 bytes at a time. This has a large impact on the impact of other improvements.

          Wow, I really edited that too much.

          Try this:
          ... its a factor of 2.5 on the previous test – including the readDouble() that reads 8 bytes at a time – and many other things aren't optimized yet. This will magnify the impact of later improvements.

          Scott Carey added a comment -

          Some bigger context:

          InputStream and OutputStream are slow and should be avoided as much as possible when copying chunks to/from them smaller than a couple hundred bytes on average.
          This performance difference isn't small, its a factor of 2.5 on the previous test for things that aren't fully optimized yet – including the readDouble() that read 8 bytes at a time. This has a large impact on the impact of other improvements. An improvement that currently helps by 10% will help by 25% after this change. If one wants to be able to have one thread decode/encode at gigabit ethernet wire speed, avoiding InputStream.read() and OutputStream.write(byte b) is mandatory – even if you use a BufferedInputStream.
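          To make the access-pattern point concrete, here is a rough illustration (not from any patch here; the class and method names are made up) of pulling 8 bytes, as readDouble() must, either with eight single-byte InputStream.read() calls or with one bulk read into a reusable byte[]:

          import java.io.EOFException;
          import java.io.IOException;
          import java.io.InputStream;

          class ReadPatternSketch {
              private final byte[] scratch = new byte[8];

              // Per-byte pattern: one stream call per byte read.
              long readBitsPerByte(InputStream in) throws IOException {
                  long bits = 0;
                  for (int i = 0; i < 8; i++) {
                      int b = in.read();
                      if (b < 0) throw new EOFException();
                      bits |= (long) (b & 0xff) << (8 * i);
                  }
                  return bits;
              }

              // Bulk pattern: one read into a reusable buffer, then plain array
              // indexing, which is what a buffering decoder can do internally.
              long readBitsBuffered(InputStream in) throws IOException {
                  int n = 0;
                  while (n < 8) {
                      int r = in.read(scratch, n, 8 - n);
                      if (r < 0) throw new EOFException();
                      n += r;
                  }
                  long bits = 0;
                  for (int i = 0; i < 8; i++) {
                      bits |= (long) (scratch[i] & 0xff) << (8 * i);
                  }
                  return bits;
              }
          }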

          This is not just for the decoder/encoder, but also for various other places where the assumed way to "pass data around" is via InputStream and OutputStream. ByteBuffer, byte[], and Channel are good options for various use cases and perform much better than an equivalent Input/Output stream when small reads and writes are done.
          There will eventually be more to change than just BinaryDecoder, and a holistic approach is better than a patchwork one.

          To address Thiru's concerns, I think that it can be made even simpler:

          void f(InputStream in) {
             BinaryDecoder bin = new BinaryDecoder(in);
             AvroObject o = readAvro(bin);
             NonAvroObject no = readNonAvro(bin.inputStream());
          }
          

          BinaryDecoder can construct a specialized InputStream inner class on demand (and cache it).
          The contract would be that once an InputStream is given to a decoder, it should not be accessed directly – not any different than what happens when you wrap an input stream with a buffered input stream.
          Alternatively Decoder could implement BufferedInputStream itself – but that would force that on all implementations.
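          A rough sketch of that on-demand, cached InputStream idea (purely illustrative; bin.inputStream(), the buffer fields, and the wrapper below are assumptions drawn from this discussion, not an existing Avro API). The view serves the decoder's buffered bytes first and then falls through to the underlying stream:

          import java.io.IOException;
          import java.io.InputStream;

          class BufferingDecoderSketch {
              private final InputStream in;        // underlying source
              private final byte[] buf = new byte[8192];
              private int pos, limit;              // unread region of buf is [pos, limit)
              private InputStream view;            // lazily created, cached wrapper

              BufferingDecoderSketch(InputStream in) { this.in = in; }

              /** InputStream positioned at the same logical point as this decoder. */
              InputStream inputStream() {
                  if (view == null) {
                      view = new InputStream() {
                          @Override public int read() throws IOException {
                              if (pos < limit) return buf[pos++] & 0xff;  // drain buffer first
                              return in.read();                           // then the raw stream
                          }
                          @Override public int read(byte[] b, int off, int len) throws IOException {
                              if (pos < limit) {
                                  int n = Math.min(len, limit - pos);
                                  System.arraycopy(buf, pos, b, off, n);
                                  pos += n;
                                  return n;
                              }
                              return in.read(b, off, len);
                          }
                      };
                  }
                  return view;
              }
          }

          The contract described above (once the original InputStream is handed to the decoder, it is not touched directly) is what keeps this view consistent.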

          If two readers need to readahead-buffer on the same data, a different API will be needed (ByteBuffers? something else? read methods that don't advance the position?).

          I can produce a patch next week for review.

          Doug Cutting added a comment -

          Here's a version of this that still needs debugging, as it fails tests. I've replaced many but not all references to BinaryDecoder with BufferedBinaryDecoder.

          Thiruvalluvan M. G. added a comment -

          > We can add a method reset() in the Decoder interface. This method would throw away the remaining contents of the buffer in BufferedBinaryDecoder ...

          We don't really need a new reset() method to clear the buffer. We can do that in the init(InputStream in) method itself.

          Doug Cutting added a comment -

          Your first case adds a single line of code, which isn't bad for consistently providing improved performance. The alternative (a non-buffering BinaryDecoder) would save a line of code but force such applications to perform more slowly.

          The second case can also be handled with the addition of a single line of code:

          void f(InputStream in) {
              BinaryDecoder bin = new BinaryDecoder(in);
              AvroObject o = readAvro(bin);
              NonAvroObject no = readNonAvro(new BinaryDecoderInputStream(bin));
          }
          

          If Decoder were made an interface, then a BinaryDecoder could also be an InputStream, so no additional lines of code would be required.

          In any case, we don't have to argue about this forever. If you feel strongly, then we can add this as BufferedBinaryDecoder and then update all uses within Avro to use it in place of BinaryDecoder, so that we provide better performance, and add a caution to BinaryDecoder stating that BufferedBinaryDecoder should be used for best performance.

          Should I prepare a patch or would you like to? Scott?

          Thiruvalluvan M. G. added a comment -

          The code snippet would be something like:

          The examples assume that readAvro(Decoder d) and readNonAvro(InputStream in) are given.

          Example 1:

          
          
          class  XXX {
              private ValidatingDecoder d = new ValidatingDecoder(schema, new BinaryDecoder(null));
              void readAvro(InputStream in) {
                  d.init(in);
                  readAvro(d);
                  readNonAvro(in);
              }
          }
          
          v.
          
          class  XXX {
              BinaryDecoder bd = new BinaryDecoder(null);
              InputStream wrap = new BinaryDecoderInputStream(bd);
              private ValidatingDecoder d = new ValidatingDecoder(schema, bd);
              void readAvro(InputStream in) {
                  d.init(in);
                  readAvro(d);
                  readNonAvro(wrap);
              }
          }
          

          Example 2:

          
          void f(InputStream in) {
              AvroObject o = readAvro(in);
              NonAvroObject no = readNonAvro(in);
          }
          
          AvroObject readAvro(InputStream in) {
             return readAvro(new BinaryDecoder(in));
          }
          
          v.
          
          class Pair {
              AvroObject first;
              InputStream second;
              Pair(AvroObject first, InputStream second) {
                  this.first = first;
                  this.second = second;
              }
          }

          void f(InputStream in) {
              Pair p = readAvro(in);
              NonAvroObject no = readNonAvro(p.second);
          }

          Pair readAvro(InputStream in) {
              BinaryDecoder d = new BinaryDecoder(in);
              AvroObject o = readAvro(d);
              return new Pair(o, new BinaryDecoderInputStream(d));
          }
          
          
          Doug Cutting added a comment -

          > This should work most of the time except when using validating or resolving decoders, where the usage becomes somewhat cumbersome.

          How would it be significantly more cumbersome than it is currently? Instead of performing non-Avro IO on the InputStream that you passed to ValidatingDecoder or ResolvingDecoder, you'd perform non-Avro IO on the BinaryDecoder that you passed to these. If an InputStream is required for the non-Avro input, then we can provide a non-buffered InputStream that wraps a BinaryDecoder. I don't see applications getting much more complicated by this.

          Thiruvalluvan M. G. added a comment -

          > More generally, since BinaryDecoder has a read(byte[]) method, it is possible to implement a Java InputStream on top of a Decoder if one needed to perform other non-Avro input from the same source. So I cannot currently imagine an application that could not work with this. Am I missing something?

          This should work most of the time except when using validating or resolving decoders, where the usage becomes somewhat cumbersome. Since constructing the grammar is relatively expensive, people cache the validating and resolving decoders. Whenever you want to read Avro data out of an input stream, you call init(inputStream) on the cached ValidatingDecoder (or ResolvingDecoder) and start reading data off the decoder. Now, after consuming the Avro data, you can't directly use the input stream you just passed in to read non-Avro data; you have to fish out the BinaryDecoder wrapper from within the cached ValidatingDecoder. I think it's counter-intuitive and hard to explain.

          Doug Cutting added a comment -

          > If we have a stream with metadata and data (like our DataFile) and if the metadata is encoded in non-avro format we'll have trouble with buffering.

          The current DataFile implementation uses a Decoder for all input, so the only problem there would be that we'd need to reset the buffer when seeking the underlying input stream, which would not be hard.

          More generally, since BinaryDecoder has a read(byte[]) method, it is possible to implement a Java InputStream on top of a Decoder if one needed to perform other non-Avro input from the same source. So I cannot currently imagine an application that could not work with this. Am I missing something?

          So, we have two candidate approaches:

          • Add a new class, BufferedBinaryDecoder, and change internal uses (like DataFileReader) to use it, but retain the non-buffering BinaryDecoder for applications that need to intermix Avro and other input.
          • Add buffering to BinaryDecoder and perhaps provide a non-buffering InputStream implementation that wraps a BinaryDecoder, for applications that intermix Avro and other data.

          I'm currently leaning towards the latter approach.

          ValidatingDecoder and ResolvingDecoder each wrap a Decoder. If they're wrapping a BinaryDecoder that buffers, and an application intermixes Avro and non-Avro data, then that application would need to use the buffered BinaryDecoder for its input, not the ValidatingDecoder or the ResolvingDecoder, since those accept only call sequences in accord with a declared schema.

          Thiruvalluvan M. G. added a comment -

          > Are there important use cases where buffering is a problem?

          If we have a stream with metadata and data (like our DataFile) and if the metadata is encoded in non-avro format we'll have trouble with buffering. One reason users may want to encode the metadata in non-avro format is that the stream contains avro- and non-avro data.

          > ... call Scott's implementation BufferedBinaryDecoder and encourage folks to use that unless buffering is a problem. Could that work?

          +1 for this proposal. We can add a method reset() in the Decoder interface. This method would throw away the remaining contents of the buffer in BufferedBinaryDecoder and do nothing in BinaryDecoder. Other decoders like ValidatingDecoder will pass this call to their underlying Decoders.

          The main problem with BinaryDecoder is that it does single-byte reads. One solution for that problem could be to encode the count of bytes for a number in the first byte itself. For example, a bit prefix of 0 may indicate that it's a 7-bit number, a prefix of 10 may indicate it's a 14-bit number (6 remaining bits of the first byte and 8 bits of the second byte), and so on. This will be as space-efficient as the current encoding scheme. We'd need at most two reads per encoded number. The performance will not be as great as the buffered decoder, but better than the current BinaryDecoder. The big drawback of this scheme is that it is an incompatible change. I don't think it's worthwhile to implement it.
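          For illustration only (this alternative length-prefix encoding was not adopted, and the names below are made up), decoding the first two prefix cases described above would look roughly like the following; longer prefixes would read the remaining bytes with a single bulk read:

          import java.io.EOFException;
          import java.io.IOException;
          import java.io.InputStream;

          class PrefixLengthSketch {
              // NOT the Avro wire format: the leading bits of the first byte say
              // how many bytes follow, so at most two reads are ever needed.
              static int readSmallNumber(InputStream in) throws IOException {
                  int first = in.read();
                  if (first < 0) throw new EOFException();
                  if ((first & 0x80) == 0) {
                      return first & 0x7f;                    // prefix '0': 7-bit number, one read
                  }
                  if ((first & 0xc0) == 0x80) {
                      int second = in.read();                 // prefix '10': 14-bit number
                      if (second < 0) throw new EOFException();
                      return ((first & 0x3f) << 8) | second;
                  }
                  // prefixes '110', '1110', ... would extend the same pattern,
                  // reading the remaining bytes with one in.read(buf, 0, k) call
                  throw new IOException("prefix not handled in this sketch");
              }
          }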

          Scott Carey added a comment -

          > I note that doReadBytes(), once the buffer is exhausted, never calls in.read(bytes, start, length). Shouldn't we bypass the buffer in this case?

          Yes, there are probably a few bugs / improvements in there and some other cleanup. I'd want to have some unit tests probe certain corner cases such as reading bytes or string types larger than the buffer.


          > This changes semantics a bit, since currently BinaryDecoder does not buffer. For example, DataFileReader seeks the underlying stream. So, in that case, we'll need to be able to reset the Decoder's buffer after a seek. There could also be places where an application reads alternately from the Decoder and its underlying InputStream, although, at a glance, I can't find any such in the code currently.

          I am not familiar enough with all the use cases to judge what the impact of these changes is. I can say that I'm certain that users will have data that mixes Avro with other data, whether that is a file or another source. Perhaps FileChannel with ByteBuffers is the best way to support that for files.

          The big thing here is that InputStream (and also OutputStream) perform very poorly when writing or reading small chunks. java.nio.Buffer is a lot better. Raw byte[] access is the fastest. Whatever is done in the long run will have to avoid using InputStream and OutputStream at a fine-grained level. Maybe that means using the nio API rather than the io API.

          Doug Cutting added a comment -

          In AVRO-315, Thiru said, "it's desirable to keep the property that the decoder does not take more bytes off the input stream than necessary. That property will allow the input stream [to be] not rewindable."

          Are there important use cases where buffering is a problem? I can't think of any, but if there are then we might support both: call Scott's implementation BufferedBinaryDecoder and encourage folks to use that unless buffering is a problem. Could that work?


            People

            • Assignee: Thiruvalluvan M. G.
            • Reporter: Thiruvalluvan M. G.
            • Votes: 0
            • Watchers: 4

