IMPALA-11580

Memory leak in legacy catalog mode when applying incremental partition updates

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Impala 4.0.0, Impala 4.1.0
    • Fix Version/s: Impala 4.2.0, Impala 4.1.1
    • Component/s: Catalog
    • Labels: None

    Description

      Since IMPALA-3127, catalogd propagates incremental metadata updates at the partition level. In the legacy catalog mode, impalad applies these updates by reusing the existing partition objects and moving them to a new HdfsTable object. However, the partition objects are immutable, so their references to the old table object remain unchanged. The JVM cannot collect the stale table objects because the reused partitions, which are now reachable from the new table, still hold active references to them.

      To reproduce the issue, create a partitioned table and add new partitions to it at a rate close to the catalog update frequency (2s by default):

      impala-shell> drop table if exists my_part_tbl;
      impala-shell> create external table my_part_tbl (id int) partitioned by (p int) stored as textfile;
      

      Add a partition every 2s:

      for i in `seq 1000`; do impala-shell.sh -q "alter table my_part_tbl add partition (p=$i)"; sleep 2; done
      

      Then monitor the number of live HdfsTable objects in each impalad JVM:

      for p in `pidof impalad`; do echo PID=$p; jmap -histo:live $p | grep 'org.apache.impala.catalog.HdfsTable$'; done
      

      In the two samples below, the second column is the live instance count. Only one impalad keeps a constant count; the count in the other two impalads keeps growing.

      $ for p in `pidof impalad`; do echo PID=$p; jmap -histo:live $p | grep 'org.apache.impala.catalog.HdfsTable$'; done
      PID=27677
       136:            14           3360  org.apache.impala.catalog.HdfsTable
      PID=27671
       136:            14           3360  org.apache.impala.catalog.HdfsTable
      PID=27668
       474:             1            240  org.apache.impala.catalog.HdfsTable
      
      $ for p in `pidof impalad`; do echo PID=$p; jmap -histo:live $p | grep 'org.apache.impala.catalog.HdfsTable$'; done
      PID=27677
       113:            21           5040  org.apache.impala.catalog.HdfsTable
      PID=27671
       113:            21           5040  org.apache.impala.catalog.HdfsTable
      PID=27668
       474:             1            240  org.apache.impala.catalog.HdfsTable
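
To track the growth without reading the histogram by eye, the instance count can be extracted from the jmap output. The following is a minimal sketch, not part of the original report: the helper names hdfs_table_count and count_trend are hypothetical, and it assumes the histogram columns are rank, instance count, bytes, and class name, as in the output above.

```shell
#!/usr/bin/env bash

# Read `jmap -histo` output on stdin and print the instance count of
# org.apache.impala.catalog.HdfsTable (0 if the class is not listed).
# Assumed column layout: rank, #instances, #bytes, class name.
hdfs_table_count() {
  awk '$4 == "org.apache.impala.catalog.HdfsTable" { print $2; found = 1 }
       END { if (!found) print 0 }'
}

# Compare two samples of the count and report whether it grew.
# (Hypothetical helper for illustration.)
count_trend() {
  local before=$1 after=$2
  if [ "$after" -gt "$before" ]; then
    echo "growing"
  else
    echo "stable"
  fi
}

# Example usage against a running impalad (not executed here):
#   before=$(jmap -histo:live "$pid" | hdfs_table_count)
#   sleep 60
#   after=$(jmap -histo:live "$pid" | hdfs_table_count)
#   count_trend "$before" "$after"
```

Matching the class name exactly in the fourth column plays the same role as the `$` anchor in the grep pattern above: nested classes such as those named HdfsTable$... are not counted.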
      

      This only happens in the legacy catalog mode and does not occur in the local-catalog mode. To work around it, start catalogd with the flag --enable_incremental_metadata_updates=false to disable incremental catalog updates.

            People

              Assignee: Quanlong Huang (stigahuang)
              Reporter: Quanlong Huang (stigahuang)
              Votes: 0
              Watchers: 3
