Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 10.2.1.6
    • Fix Version/s: 10.2.1.6
    • Component/s: SQL
    • Labels:
      None

      Description

      Suppose I have 1) a table "t1" with blob data in it, and 2) an UPDATE trigger "tr1" defined on that table, where the triggered-SQL-action for "tr1" does NOT reference any of the blob columns in the table. [ Note that this is different from DERBY-438: DERBY-438 deals with triggers that do reference the blob column(s), whereas this issue deals with triggers that do not. I think the two are related, so I'm creating this as a subtask of DERBY-438. ] In such a case, if the trigger fires, the blob data is streamed into memory and consumes JVM heap, even though the trigger statement never references or accesses it.

      For example, suppose we have the following DDL:

      create table t1 (id int, status smallint, bl blob(2G));
      create table t2 (id int, updated int default 0);
      create trigger tr1 after update of status on t1
          referencing new as n_row
          for each row mode db2sql
          update t2 set updated = updated + 1 where t2.id = n_row.id;

      Then if t1 and t2 both have data and we make a call to:

      update t1 set status = 3;

      the trigger tr1 will fire, which will cause the blob column in t1 to be streamed into memory for each row affected by the trigger. The result is that, if the blob data is large, we end up using a lot of JVM memory when we really shouldn't have to (at least, in theory we shouldn't have to...).

      Ideally, Derby could figure out whether or not the blob column is referenced, and avoid streaming the lob into memory whenever possible (hence this is probably more of an "enhancement" request than a bug)...

          Activity

          A B added a comment -

          Attaching a repro for the problem. I couldn't find a clean way to show the extra memory usage, so this program runs in two stages. First you have to load the data (with default JVM heap size), and then you run the test while specifying a limited JVM heap size (4M is what works for me; not sure if that will change for other people). When running the test, the program first executes a set of update statements that fire triggers on a table withOUT blob columns. Then it does the exact same thing on a table WITH blob columns. Ideally, the memory usage for the two scenarios should be similar (Derby would know that the blobs aren't used and so wouldn't stream them into memory)--but what will happen with this program is that an "OutOfMemory" exception will occur in the second scenario, because the streaming blobs take up all of the (limited) JVM heap.

          To run:

          First:
          java d442 load

          Then:
          java -Xmx4m d442 run // "4m" is small enough to show the error, but large enough to let Derby boot...

          Manish Khettry added a comment -

          If a table has triggers, we end up reading all the columns from the base table. This bit of code in UpdateNode#getUpdateReadMap says:

          /*
           * If we have any triggers, then get all the columns
           * because we don't know what the user will ultimately
           * reference.
           */
          baseTable.getAllRelevantTriggers( StatementType.UPDATE, changedColumnIds, relevantTriggers );
          if ( relevantTriggers.size() > 0 ) { needsDeferredProcessing[0] = true; }

          if (relevantTriggers.size() > 0)
          {
              for (int i = 1; i <= columnCount; i++)
              {
                  columnMap.set(i);
              }
          }

          If we want to be smart and not read the columns which are not needed by the trigger, the trigger descriptor and the system table will have to remember which columns are referenced by the trigger. Confusingly, the referencedcolumns column of SYSTRIGGERS actually contains the triggering columns!
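
To make the idea above concrete, here is a minimal sketch of how a read map could be narrowed if each trigger recorded the columns its action references. It is an illustration under stated assumptions, not Derby code: the class ReadMapSketch, the method buildReadMap, and the per-trigger column arrays are all hypothetical, and a real implementation would likely also have to account for columns needed by constraints, indexes, and the WHERE clause.

```java
import java.util.BitSet;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch; this is not Derby's UpdateNode code. */
final class ReadMapSketch {

    /**
     * Builds a bit set of the 1-based column positions an UPDATE must read.
     *
     * @param columnCount number of columns in the base table
     * @param updatedColumns 1-based positions of the columns being updated
     * @param triggerReferencedColumns for each relevant trigger, the 1-based
     *        positions its action references (assumed metadata that the
     *        trigger descriptor does not currently record)
     */
    static BitSet buildReadMap(int columnCount, int[] updatedColumns,
                               List<int[]> triggerReferencedColumns) {
        BitSet columnMap = new BitSet(columnCount + 1);
        for (int col : updatedColumns) {
            columnMap.set(col);
        }
        // Only add the columns each trigger action actually uses,
        // instead of setting every bit from 1 to columnCount.
        for (int[] cols : triggerReferencedColumns) {
            for (int col : cols) {
                columnMap.set(col);
            }
        }
        return columnMap;
    }

    public static void main(String[] args) {
        // t1 (id int, status smallint, bl blob(2G)): id = 1, status = 2, bl = 3.
        // "update t1 set status = 3" changes column 2; tr1 references n_row.id.
        BitSet map = buildReadMap(3, new int[] {2},
                Collections.singletonList(new int[] {1}));
        System.out.println(map); // prints {1, 2}; the blob column (3) is not read
    }
}
```

With the tables from the description, the blob column stays out of the read map because neither the UPDATE nor the trigger action touches it.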

          Satheesh Bandaram added a comment -

          Thanks, Manish, for researching this. Yes, this has been a known troublesome area for some time. The problem gets worse if there are any BLOB/CLOB columns in the table. Even if these columns are not referenced in the trigger, the current logic materializes the whole BLOB/CLOB into memory.

          According to the Derby documentation, REFERENCEDCOLUMNS is 'true descriptor of the columns referenced by UPDATE triggers'. REFERENCEDCOLUMNS is used to keep a list of the columns that trigger the update. What you are proposing needs another column that keeps track of all columns used in the trigger, not just the triggering columns. That would help avoid materializing unreferenced columns, but we may also need to avoid BLOB/CLOB materialization so that they stream on demand. There is another issue with supporting BLOB/CLOBs in triggers.
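
As a rough illustration of the difference between eager materialization and on-demand access, the sketch below defers loading a value until something asks for it. All names here (LazyColumnSketch, LazyValue) are hypothetical and none of this is Derby code. It also only defers the load: true streaming would avoid holding the whole BLOB in memory even after it is accessed.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of deferred column materialization; not Derby code. */
final class LazyColumnSketch {

    /** Wraps a loader and runs it at most once, on first access. */
    static final class LazyValue<T> {
        private final Supplier<T> loader;
        private T value;
        private boolean loaded;

        LazyValue(Supplier<T> loader) {
            this.loader = loader;
        }

        /** Loads the value on the first call and returns the cached value afterwards. */
        T get() {
            if (!loaded) {
                value = loader.get();
                loaded = true;
            }
            return value;
        }

        boolean isLoaded() {
            return loaded;
        }
    }

    public static void main(String[] args) {
        // Stand-in for a large BLOB column: 1 MB allocated only when requested.
        LazyValue<byte[]> blob = new LazyValue<>(() -> new byte[1024 * 1024]);

        // A trigger action that never touches the blob leaves it unloaded.
        System.out.println(blob.isLoaded()); // prints false

        // Only an explicit reference pays the memory cost.
        System.out.println(blob.get().length); // prints 1048576
        System.out.println(blob.isLoaded()); // prints true
    }
}
```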

          Daniel John Debrunner added a comment -

          Moved to DERBY-1482 as it is a stand-alone bug, not a sub-task of DERBY-438


            People

            • Assignee:
              Unassigned
              Reporter:
              A B