Avro / AVRO-1111

Malformed data can cause OutOfMemoryError in Avro IPC

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.3
    • Fix Version/s: 1.7.2
    • Component/s: java
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      If the data that comes in through the Netty channel buffer is not framed correctly/is not valid Avro data, then the incoming data can cause arbitrarily large array lists to be created, causing OutOfMemoryError.

      The relevant code (org.apache.avro.ipc.NettyTransportCodec):

      private boolean decodePackHeader(ChannelHandlerContext ctx, Channel channel,
          ChannelBuffer buffer) throws Exception {
        if (buffer.readableBytes() < 8) {
          return false;
        }
        int serial = buffer.readInt();
        listSize = buffer.readInt();
        dataPack = new NettyDataPack(serial, new ArrayList<ByteBuffer>(listSize));
        return true;
      }

      If the buffer does not have valid Avro data, the listSize variable can have arbitrary values, causing massive ArrayLists to be created, leading to OutOfMemoryErrors.
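      The hazard is easy to demonstrate with plain java.nio: the 4-byte list-size field is trusted verbatim, so a corrupt or hostile header can request an arbitrarily large allocation. A minimal sketch of a header guard, assuming a simple fixed cap (the class name, constant, and cap value are hypothetical; java.nio.ByteBuffer stands in for Netty's ChannelBuffer):

      ```java
      import java.nio.ByteBuffer;

      public class PackHeaderGuard {
          // Hypothetical cap; the actual fix in AVRO-1111 derives its limit from heap size.
          static final int MAX_LIST_SIZE = 1000;

          // Reads the 8-byte pack header (serial, listSize) and validates listSize
          // BEFORE any ArrayList is sized from it.
          public static int readListSize(ByteBuffer buffer) {
              if (buffer.remaining() < 8) {
                  throw new IllegalStateException("incomplete header");
              }
              int serial = buffer.getInt();   // read and discard for this sketch
              int listSize = buffer.getInt();
              if (listSize < 0 || listSize > MAX_LIST_SIZE) {
                  // Malformed or malicious header: refuse rather than allocate.
                  throw new IllegalArgumentException("implausible list size: " + listSize);
              }
              return listSize;
          }

          public static void main(String[] args) {
              ByteBuffer ok = ByteBuffer.allocate(8).putInt(42).putInt(3);
              ok.flip();
              System.out.println(readListSize(ok));

              ByteBuffer bad = ByteBuffer.allocate(8).putInt(42).putInt(Integer.MAX_VALUE);
              bad.flip();
              try {
                  readListSize(bad);
              } catch (IllegalArgumentException e) {
                  System.out.println("rejected");
              }
          }
      }
      ```

      The point of validating before the ArrayList constructor runs is that ArrayList eagerly allocates its backing array at the requested capacity, so the OutOfMemoryError happens before any further protocol checking can occur.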

      1. AVRO-1111-1.patch
        3 kB
        Mike Percy
      2. AVRO-1111-2.patch
        3 kB
        Mike Percy

        Issue Links

          Activity

          Mike Percy added a comment -

          I think this patch is more limited but it's a first pass. What it does is try to calculate whether an OOM will likely result from the listSize allocation from the RPC header field, and if so throw an AvroRemoteException. It also closes the socket.

          I think it makes more sense to set an arbitrary limit on the listSize and check for that, since we should throw closer to 1000 instead of, like, 7 billion or something. But I'm not familiar enough with the Responder implementation to know whether something like that is a bad idea or not... would there be repercussions to that?

          There are also other places where we should do some validation, such as the length headers before the byte buffers... but that is a little tricky too (arbitrary limits can be problematic for obvious reasons). I'd like some suggestions for how to handle this as the current state leaves the protocol pretty brittle.

          Philip Zeyliger added a comment -

          Mike, your patch seems like a pragmatic approach. I'm +1 on the patch. You might be even more conservative: 10% of the maximum memory seems like more than large enough. Even 100MB seems large enough.

          There are two other places we could annotate this sort of information. We could annotate the protocol description to say "maxSize=100MB", to limit the size of arrays. That requires a protocol change, and it also requires keeping track of the size of requests, which is tricky in its own way.

          Another approach is to pass a "max size" to the transceiver when instantiating it. An application might be able to say "never accept RPCs > 100MB" (in fact, that's a reasonable default). If an application wants to use larger ones, it can configure the server appropriately, thereby bypassing the check.

          Thoughts on these alternatives?
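          The second alternative above (a max size passed in when the server is instantiated) could look roughly like this sketch. The class and method names are hypothetical, not the actual Avro API; only the idea of a per-server configurable cap with a 100MB default comes from the discussion:

          ```java
          // Sketch of a server-side configurable request cap, as suggested above.
          public class BoundedReceiver {
              // "never accept RPCs > 100MB" as the reasonable default.
              public static final long DEFAULT_MAX_RPC_BYTES = 100L * 1024 * 1024;

              private final long maxRpcBytes;

              public BoundedReceiver() {
                  this(DEFAULT_MAX_RPC_BYTES);
              }

              // Applications that need larger RPCs raise the cap explicitly,
              // thereby bypassing the default check.
              public BoundedReceiver(long maxRpcBytes) {
                  this.maxRpcBytes = maxRpcBytes;
              }

              // Rejects a request whose declared size exceeds the configured cap.
              public boolean accepts(long declaredSize) {
                  return declaredSize >= 0 && declaredSize <= maxRpcBytes;
              }

              public static void main(String[] args) {
                  BoundedReceiver server = new BoundedReceiver();
                  System.out.println(server.accepts(64 * 1024));
                  System.out.println(server.accepts(500L * 1024 * 1024));
              }
          }
          ```

          The appeal of this shape is that the limit lives entirely on the server side and requires no protocol change, at the cost of not being discoverable by clients.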

          Mike Percy added a comment -

          Phil, thank you very much for the ideas! I agree with trying harder to count the actual received bytes (deep count). Controlling the max size on the server, while not requiring the client to provide it, looks more future-proof to me.

          Tom White added a comment -

          +1 for the patch (with Philip's 10% suggestion). I would also change the test to assert that no response was received, rather than just printing a message.

          Mike Percy added a comment -

          Philip / Tom,
          I've updated the patch to fail @ 10% of available memory, and added the Assert to the unit test.
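          The "fail at 10% of available memory" check can be sketched like this, assuming an estimate of 8 bytes per ArrayList slot for the reference array (the class and method names are hypothetical; the threshold arithmetic follows the 10% suggestion above):

          ```java
          public class HeapFractionCheck {
              // Rejects listSize values whose backing-array allocation would exceed
              // roughly 10% of the JVM's maximum heap. Each ArrayList slot holds an
              // object reference; 8 bytes per slot is a conservative estimate.
              public static boolean plausible(int listSize) {
                  long maxHeap = Runtime.getRuntime().maxMemory();
                  long estimatedBytes = 8L * listSize;
                  return listSize >= 0 && estimatedBytes <= maxHeap / 10;
              }

              public static void main(String[] args) {
                  // A small header passes; Integer.MAX_VALUE would imply ~17GB of
                  // references and is rejected on any ordinarily-sized heap.
                  System.out.println(plausible(16));
                  System.out.println(plausible(Integer.MAX_VALUE));
              }
          }
          ```

          Tying the limit to Runtime.getRuntime().maxMemory() avoids hard-coding a byte count while still bounding what a single malformed header can allocate.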

          Tom White added a comment -

          I just committed this. Thanks Mike!

          Mike Percy added a comment -

          Thanks Tom!


            People

             • Assignee:
               Mike Percy
             • Reporter:
               Hari Shreedharan
             • Votes:
               0
             • Watchers:
               9

              Dates

              • Created:
                Updated:
                Resolved:
