IMPALA-2648: catalogd crashes when serialized messages are over 2 GB


    Description

      We've seen a catalogd crash triggered by loading the metadata for a table with incremental stats, about 20K partitions, and 77 columns. It looks like the serialized message exceeds 2 GB, which is larger than the maximum size of a Java array. Ideally we should catch this error and fail the query that needs this table's metadata with an appropriate message.

      I1107 06:47:56.641507 30252 jni-util.cc:177] java.lang.OutOfMemoryError: Requested array size exceeds VM limit
      at java.util.Arrays.copyOf(Arrays.java:2271)
      at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
      at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
      at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
      at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
      at org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:187)
      at com.cloudera.impala.thrift.THdfsPartition$THdfsPartitionStandardScheme.write(THdfsPartition.java:1831)
      at com.cloudera.impala.thrift.THdfsPartition$THdfsPartitionStandardScheme.write(THdfsPartition.java:1543)
      at com.cloudera.impala.thrift.THdfsPartition.write(THdfsPartition.java:1389)
      at com.cloudera.impala.thrift.THdfsTable$THdfsTableStandardScheme.write(THdfsTable.java:1123)
      at com.cloudera.impala.thrift.THdfsTable$THdfsTableStandardScheme.write(THdfsTable.java:969)
      at com.cloudera.impala.thrift.THdfsTable.write(THdfsTable.java:848)
      at com.cloudera.impala.thrift.TTable$TTableStandardScheme.write(TTable.java:1628)
      at com.cloudera.impala.thrift.TTable$TTableStandardScheme.write(TTable.java:1395)
      at com.cloudera.impala.thrift.TTable.write(TTable.java:1209)
      at com.cloudera.impala.thrift.TCatalogObject$TCatalogObjectStandardScheme.write(TCatalogObject.java:1241)
      at com.cloudera.impala.thrift.TCatalogObject$TCatalogObjectStandardScheme.write(TCatalogObject.java:1098)
      at com.cloudera.impala.thrift.TCatalogObject.write(TCatalogObject.java:938)
      at com.cloudera.impala.thrift.TGetAllCatalogObjectsResponse$TGetAllCatalogObjectsResponseStandardScheme.write(TGetAllCatalogObjectsResponse.java:487)
      at com.cloudera.impala.thrift.TGetAllCatalogObjectsResponse$TGetAllCatalogObjectsResponseStandardScheme.write(TGetAllCatalogObjectsResponse.java:421)
      at com.cloudera.impala.thrift.TGetAllCatalogObjectsResponse.write(TGetAllCatalogObjectsResponse.java:365)
      at org.apache.thrift.TSerializer.serialize(TSerializer.java:79)
      at com.cloudera.impala.service.JniCatalog.getCatalogObjects(JniCatalog.java:110)
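
      A minimal sketch of the guard suggested above, assuming a small wrapper around org.apache.thrift.TSerializer; the class name, method name, and error message are illustrative assumptions, not the actual Impala change:

      // Sketch only: catch the array-size OutOfMemoryError thrown while the
      // serializer grows its byte buffer, and turn it into a TException so the
      // request fails with a message instead of crashing catalogd.
      import org.apache.thrift.TBase;
      import org.apache.thrift.TException;
      import org.apache.thrift.TSerializer;
      import org.apache.thrift.protocol.TBinaryProtocol;

      public class SafeThriftSerializer {
        public static byte[] serializeOrThrow(TBase<?, ?> obj) throws TException {
          TSerializer serializer = new TSerializer(new TBinaryProtocol.Factory());
          try {
            // Java arrays are int-indexed, so the serialized buffer cannot grow
            // past roughly 2^31 - 1 bytes.
            return serializer.serialize(obj);
          } catch (OutOfMemoryError e) {
            throw new TException("Serialized catalog object exceeds the 2 GB "
                + "Java array limit", e);
          }
        }
      }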
      

      You can identify this issue by querying the metastore database. Here is how to see the size of the incremental stats for the table with ID 12345; this table, with 624 MB of incremental stats, led to the catalogd crash shown above.

      postgres=# select pg_size_pretty(sum(length("PARTITION_PARAMS"."PARAM_KEY") + length("PARTITION_PARAMS"."PARAM_VALUE"))) from "PARTITIONS", "PARTITION_PARAMS" where "PARTITIONS"."TBL_ID"=12345 and "PARTITIONS"."PART_ID" = "PARTITION_PARAMS"."PART_ID"  and "PARTITION_PARAMS"."PARAM_KEY" LIKE 'impala_intermediate%';
       pg_size_pretty 
      ----------------
       624 MB
      (1 row)
      

            People

              Assignee: Tianyi Wang (tianyiwang)
              Reporter: Silvius Rus (srus)
              Votes: 2
              Watchers: 10
