IMPALA-13385

The DDL response exceeds the VM array size limit when serializing the table catalog

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Impala 4.0.0, Impala 4.1.0, Impala 4.2.0, Impala 4.1.1, Impala 4.1.2, Impala 4.3.0, Impala 4.4.0, Impala 4.4.1
    • Fix Version/s: None
    • Component/s: Catalog
    • Labels: None

    Description

              At present, when catalogd responds to a DDL operation, it sends the entire table object. For a Hive partitioned table, this can mean transferring a very large amount of table catalog data. In one of our customers' clusters, there is a Hive partitioned table with over 4,000 columns, more than 20,000 partitions, and over 10 million HDFS files. When an `ALTER TABLE ADD PARTITION` operation is executed on this table, the serialized catalog for the table exceeds the Java array size limit, resulting in the following exception: `java.lang.OutOfMemoryError: Requested array size exceeds VM limit`.
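To make the limit concrete, here is a rough back-of-envelope sketch. It relies only on the fact that Java arrays are indexed by `int`, so a single `byte[]` cannot hold more than `Integer.MAX_VALUE` bytes (common JVMs allow slightly fewer). The class and method names are hypothetical and not part of Impala, and the calculation ignores the column and partition metadata that also share the buffer.

```java
/** Back-of-envelope check of the Java array size ceiling (illustrative only, not Impala code). */
final class ArrayLimitBudget {
    // Java arrays are indexed by int, so a byte[] can never hold more bytes than this.
    // Real JVMs cap the usable size slightly below this value.
    static final long MAX_ARRAY_BYTES = Integer.MAX_VALUE;

    /** Average number of bytes available per item if itemCount items must share one byte[]. */
    static long bytesPerItem(long itemCount) {
        if (itemCount <= 0) {
            throw new IllegalArgumentException("itemCount must be positive");
        }
        return MAX_ARRAY_BYTES / itemCount;
    }

    public static void main(String[] args) {
        long files = 10_000_000L; // file count quoted in the description
        System.out.println("Average bytes available per file: " + bytesPerItem(files));
    }
}
```

Under these assumptions, a table with 10 million files leaves an average of roughly 214 bytes per file in a single serialized buffer, before any column or partition metadata is counted.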

              To alleviate the issue, we can use TCompactProtocol instead of TBinaryProtocol for Thrift serialization. In an experiment with a Hive table containing 160 partitions, I observed that TCompactProtocol reduced the serialized data size by 34.4% compared to TBinaryProtocol.
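One reason the compact protocol tends to be smaller is its integer encoding: TBinaryProtocol writes every i32 as 4 fixed bytes, while TCompactProtocol writes it as a ZigZag-encoded variable-length integer. The sketch below reimplements that encoding with the JDK only, so it does not call the Thrift library; the class and method names are hypothetical. It is meant only to show the size effect for single integers, not to reproduce the 34.4% figure, which depends on the full mix of fields in the table object.

```java
import java.io.ByteArrayOutputStream;

/** Stand-alone illustration of ZigZag + varint sizing for i32 values (not Thrift or Impala code). */
final class VarintSizeSketch {
    // A fixed-width encoding spends 4 bytes on every i32, regardless of its value.
    static final int FIXED_I32_SIZE = 4;

    /** ZigZag maps signed values to unsigned ones so that small magnitudes stay small. */
    static int zigZag32(int n) {
        return (n << 1) ^ (n >> 31);
    }

    /** Encodes a value as a base-128 varint: 7 payload bits per byte, low-order group first. */
    static byte[] encodeVarint32(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // set the continuation bit
            value >>>= 7;                     // unsigned shift so the loop terminates
        }
        out.write(value);
        return out.toByteArray();
    }

    /** Number of bytes a ZigZag varint needs for the given i32. */
    static int compactI32Size(int n) {
        return encodeVarint32(zigZag32(n)).length;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 64, 20_000, Integer.MAX_VALUE};
        for (int n : samples) {
            System.out.println(n + ": fixed=" + FIXED_I32_SIZE + " bytes, varint=" + compactI32Size(n) + " bytes");
        }
    }
}
```

Small values such as counts and identifiers take 1 to 3 bytes instead of 4, while values near `Integer.MAX_VALUE` take 5, so the saving depends on how the integers in the table object are distributed.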

              Here are potential solutions for addressing this issue:

              DDL operations only: Use TCompactProtocol for serializing table catalog during ExecDdl operations. This would involve fewer changes but requires adjustments to JniUtil.

              Global replacement with TCompactProtocol: Replace all serialization operations within Impala with TCompactProtocol. Although this is a larger change, the overall code becomes cleaner. In 329 internal benchmark tests, I found no significant performance degradation compared to the previous implementation, and memory usage was reduced.

             Looking forward to any feedback.

       

       

            People

              Assignee: qqzhang zhangqianqiong
              Reporter: qqzhang zhangqianqiong
              Votes: 0
              Watchers: 5
