[IMPALA-13438] In alterTableRecoverPartitions, we should batch the addHmsPartitions operations. - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: Impala 4.1.0, Impala 4.2.0, Impala 4.1.1, Impala 4.1.2, Impala 4.3.0, Impala 4.4.0, Impala 4.4.1
Fix Version/s: None
Component/s: Catalog
Labels: None

Description

After the change 'IMPALA-10502: Handle CREATE/DROP events correctly' was merged, the alterTableRecoverPartitions method no longer batches its add_partitions calls; it now passes all recovered partitions to addHmsPartitions in a single call. For a table with a very large number of partitions, this builds one huge temporary List<Partition>, which can lead to an OutOfMemoryError.

In my test environment, the catalogd JVM Xmx was set to 2GB. Running the end-to-end test custom_cluster/test_wide_table_operations.py against a table with 2000 columns and 50,000 partitions, catalogd hit a Java heap space OutOfMemoryError during the recover partitions operation.

Analyzing the heap dump with MemoryAnalyzer showed that the temporary list held a massive number of FieldSchema objects (2000 columns * 50,000 partitions, i.e. up to 100 million), which exhausted the heap.

To resolve this issue, we propose batching the addHmsPartitions calls, ensuring that temporary objects are released after each batch operation. This solution was tested and verified to resolve the OutOfMemoryError, ensuring system stability when handling a large number of partitions.
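A minimal sketch of the batching idea, not the actual patch: the class BatchedPartitionAdd, the method forEachBatch, and the batch size used here are hypothetical names and values chosen for illustration. The assumption is that the caller holds only lightweight partition descriptors and that the callback builds the heavy HMS partition objects for one slice at a time, so they become unreachable once each batch call returns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration only; this is not Impala's code.
class BatchedPartitionAdd {
  /**
   * Invokes addBatch once per consecutive slice of at most batchSize items
   * and returns the number of batches processed. Each slice is a view from
   * List.subList, so no copy of the input list is made.
   */
  static <T> int forEachBatch(List<T> items, int batchSize, Consumer<List<T>> addBatch) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    int batches = 0;
    for (int start = 0; start < items.size(); start += batchSize) {
      int end = Math.min(start + batchSize, items.size());
      // Heavy per-partition objects should be created inside the callback,
      // so that they can be garbage collected after this call returns.
      addBatch.accept(items.subList(start, end));
      batches++;
    }
    return batches;
  }

  public static void main(String[] args) {
    // Stand-in for the partitions found on the file system.
    List<String> partitionNames = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
      partitionNames.add("p=" + i);
    }
    // In catalogd the callback would build the HMS partitions for this
    // slice and pass them to addHmsPartitions (assumption).
    int batches = forEachBatch(partitionNames, 4,
        batch -> System.out.println("adding " + batch.size() + " partitions"));
    System.out.println(batches + " batches");
  }
}
```

With this shape, peak temporary memory scales with the batch size rather than with the total number of partitions, at the cost of more round trips to the metastore.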

People

Assignee: zhangqianqiong

Reporter: zhangqianqiong

Votes: 0

Watchers: 3

Dates

Created: 11/Oct/24 09:58

Updated: 12/Oct/24 21:45