Uploaded image for project: 'Ignite'
  1. Ignite
  2. IGNITE-22757

Excessive memory usage in schema-related code in SQL

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 3.0
    • sql

    Description

      I have the following test:

       

      class It1000TablesTest extends ClusterPerTestIntegrationTest {
          private static final String DEFAULT_STORAGE_ENGINE = "<default>";
      
          private final String storageEngine = "aipersist";
      
          private final int columnsCount = 200;
      
          @Override
          protected int initialNodes() {
              return 1;
          }
      
          @Test
          void test() {
              String storageProfile =
                      DEFAULT_STORAGE_ENGINE.equals(storageEngine) ? DEFAULT_STORAGE_PROFILE : "default_" + storageEngine.toLowerCase();
      
              String zoneSql = "create zone test_zone with replicas=1, storage_profiles='" + storageProfile + "';";
      
              cluster.doInSession(0, session -> {
                  executeUpdate(zoneSql, session);
              });
      
              for (int i = 0; i < 1000; i++) {
                  String tableName = String.format("table%03d", i);
      
                  String valColumns = columnNames()
                          .map(colName -> colName + " varchar(40)")
                          .collect(joining(", "));
      
                  String tableSql = "create table " + tableName + " (key int primary key, " + valColumns + ")"
                          + " with primary_zone='TEST_ZONE', storage_profile='" + storageProfile + "';";
      
                  String columnNames = columnNames().collect(joining(", "));
                  String values = IntStream.range(0, columnsCount)
                          .mapToObj(n -> UUID.randomUUID().toString())
                          .map(s -> "'" + s + "'")
                          .collect(joining(", "));
      
                  String insertSql = "insert into " + tableName + " (key, " + columnNames + ") values (" + i + ", " + values + ")";
      
                  cluster.doInSession(0, session -> {
                      executeUpdate(tableSql, session);
      
                      executeUpdate(insertSql, session);
                  });
      
                  int createdTables = i + 1;
      
                  if (createdTables % 1 == 0) {
                      log.info("XXX Created " + createdTables + " tables");
                  }
              }
          }
      
          private Stream<String> columnNames() {
              return IntStream.range(0, columnsCount)
                      .mapToObj(n -> String.format("val%03d", n));
          }
      }
      

      It just tries to create a 1000 of tables, 201 column each (sharing the same zone), making an insert to each of them after creating it.

       

      After creating about 200 tables I took a heap dump, here are the top consumers of the heap:

      There are just around tables, but 20k IgniteTableImpl instances and more than 4M CatalogColumnDescriptor instances. It feels like an arithmetic progression: sum of 1..200 gives (1+200)*100≈20000, as if addition of a new table made a copy of all existing tables as well.

      SqlSchemaManager caches tables by <tableId, catalogVersion> pair, so, if a catalog is modified N times in a way that does not concern a table, N copies of the table might be created in the cache (and they do get created). It seems natural to cache tables (maybe additionally to the existing caching) by <tableId, tableVersion>.

      Another problem is that, as the cache is bounded, it starts forgetting older instances, so they get recreated; but those older instances are actually used by some internal structures of Calcite, so, even if tables are properly cached by <tableId, tableVersion>, duplicates will emerge when anough tables are created.

      Attachments

        1. image-2024-07-17-14-13-29-683.png
          108 kB
          Roman Puchkovskiy

        Issue Links

          Activity

            People

              mzhuravkov Maksim Zhuravkov
              rpuch Roman Puchkovskiy
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 1h 20m
                  1h 20m