IMPALA-3961

java.io.ByteArrayOutputStream unable to allocate byte array with Integer.MAX_VALUE size

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: Impala 2.6.0
    • Fix Version/s: None
    • Component/s: Catalog
    • Labels: None

    Description

      The size limit for a serialized catalog update is not 2GB, as has been commonly believed. In practice it is 1GB.

      The observed behavior is

      java.lang.OutOfMemoryError: Requested array size exceeds VM limit
      

      Looking at /usr/lib/jvm/jdk1.7.0_75/src.zip!/java/io/ByteArrayOutputStream.java on my system, the line that triggers this OOM is:

          // java.io.ByteArrayOutputStream
          private void grow(int minCapacity) {
              // overflow-conscious code
              int oldCapacity = buf.length;
              int newCapacity = oldCapacity << 1;
              if (newCapacity - minCapacity < 0)
                  newCapacity = minCapacity;
              if (newCapacity < 0) {
                  if (minCapacity < 0) // overflow
                      throw new OutOfMemoryError();
                  newCapacity = Integer.MAX_VALUE;
              }
              buf = Arrays.copyOf(buf, newCapacity);  // <-- this line
          }
      
          // java.util.Arrays
          public static byte[] copyOf(byte[] original, int newLength) {
              byte[] copy = new byte[newLength];  // <-- the OOM is thrown here
              System.arraycopy(original, 0, copy, 0,
                               Math.min(original.length, newLength));
              return copy;
          }
      

      Note how the assignment newCapacity = Integer.MAX_VALUE; two lines above that line tries to prevent this from happening, but ironically it does not necessarily work. I tried:

      byte[] test = new byte[Integer.MAX_VALUE];
      

      and it OOMs the same way right away. Integer.MAX_VALUE-1 also OOMs. The allocation only succeeds at byte[] test = new byte[Integer.MAX_VALUE-2];
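
      The probe described above can be repeated with a sketch like the one below. The class ArrayLimitProbe and the helper tryAllocate are hypothetical names, not part of Impala or the JDK. The exact array limit is JVM-specific, and the Integer.MAX_VALUE-2 case can only succeed if the heap is large enough (roughly -Xmx3g or more), so treat the output as an observation about one JVM, not a guarantee.

```java
public class ArrayLimitProbe {
    /**
     * Tries to allocate a byte array of the given length.
     * Returns null on success, otherwise the OutOfMemoryError message.
     */
    static String tryAllocate(int length) {
        try {
            byte[] probe = new byte[length];
            return probe.length == length ? null : "unexpected length";
        } catch (OutOfMemoryError e) {
            // Typically "Requested array size exceeds VM limit" or "Java heap space".
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) {
        int[] lengths = {
            Integer.MAX_VALUE, Integer.MAX_VALUE - 1, Integer.MAX_VALUE - 2
        };
        for (int length : lengths) {
            String error = tryAllocate(length);
            System.out.println(length + ": "
                + (error == null ? "allocated" : "OutOfMemoryError: " + error));
        }
    }
}
```

      Catching OutOfMemoryError is acceptable here only because the probe does nothing else. Production code should not rely on recovering from it.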

      This puts a 1GB limit on the update size, because the default ByteArrayOutputStream size from Thrift is 32 bytes (a power of 2) and the stream fails when its capacity tries to grow from 1GB to 2GB. Since the buffer grows by doubling and 1GB is also a power of 2, the capacity is guaranteed to hit exactly 1GB. The next doubling overflows, grow() requests Integer.MAX_VALUE bytes instead, and that allocation OOMs. So once an update grows past 1GB it will OOM, and the usable capacity is essentially halved.
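
      The arithmetic can be checked without allocating anything. The sketch below copies only the capacity computation from the grow() method quoted above. The class GrowSimulation and both method names are hypothetical, and it assumes the stream is filled by many small writes, so each growth step is triggered by a request for one byte more than the current capacity.

```java
public class GrowSimulation {
    /** Capacity arithmetic of the quoted grow(), with the allocation left out. */
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity << 1;
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity < 0) {
            if (minCapacity < 0) // overflow
                throw new OutOfMemoryError();
            newCapacity = Integer.MAX_VALUE;
        }
        return newCapacity;
    }

    /**
     * Doubles from initialCapacity until grow() would request
     * Integer.MAX_VALUE, and returns the last capacity before that request.
     */
    static int lastCapacityBeforeMax(int initialCapacity) {
        int capacity = initialCapacity;
        while (true) {
            int next = nextCapacity(capacity, capacity + 1);
            if (next == Integer.MAX_VALUE) {
                return capacity;
            }
            capacity = next;
        }
    }

    public static void main(String[] args) {
        int last = lastCapacityBeforeMax(32);
        System.out.println("last capacity before Integer.MAX_VALUE: " + last);
        System.out.println("is exactly 1GB: " + (last == (1 << 30)));
    }
}
```

      Starting from 32 bytes, the last capacity that grow() reaches before asking for Integer.MAX_VALUE is 1 << 30, i.e. exactly 1GB.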

      To double our capacity (scalability), I think we can patch Thrift to use a "better" initial ByteArrayOutputStream size, one whose doublings land closer to the JVM array limit instead of stopping at 1GB. (This assumes the JVM is beyond our control.)
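
      To illustrate the idea, the sketch below computes the largest capacity a doubling buffer reaches for a given initial size. The class InitialSizeSketch, the method peakCapacity and the initial size 63 are illustrative assumptions only. This is not a proposed Thrift patch, and it assumes growth happens purely by doubling.

```java
public class InitialSizeSketch {
    /**
     * Largest capacity reached by repeated doubling from initialSize
     * before the next doubling would exceed Integer.MAX_VALUE.
     * Assumes initialSize > 0.
     */
    static int peakCapacity(int initialSize) {
        long capacity = initialSize;
        while (capacity * 2 <= Integer.MAX_VALUE) {
            capacity *= 2;
        }
        return (int) capacity;
    }

    public static void main(String[] args) {
        // 32 is the default size mentioned above; 63 is a hypothetical alternative.
        for (int initialSize : new int[] {32, 63}) {
            int peak = peakCapacity(initialSize);
            System.out.println("initial " + initialSize + " -> peak " + peak + " bytes");
        }
    }
}
```

      With an initial size of 32 the peak is 1073741824 bytes (1GB). With 63 it is 2113929216 bytes, which is below the Integer.MAX_VALUE-2 allocation that succeeded in the test above. Whether an array that large can actually be allocated still depends on the available heap.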

      This would bring Impala up to the limit described in IMPALA-2648.

            People

              HuaisiXu Huaisi Xu