Scrollable Updatable ResultSets (Embedded)

Authors: Andreas Korneliussen, Fernanda Pizzorno, Dag H. Wanvik

Summary

This write-up covers the main patch for embedded SUR (DERBY-690), sans holdability (will be covered by a later patch), as will SUR for the client driver (DERBY-775). It follows closely the specification uploaded for that issue; the open issue in the specification document on conflict handling is resolved, see the section on conflict handling. First, we give an overview of the implementation approach, then give more detailed comments covering the changed files.

Motivation

Result sets that are scrollable and updatable have been a part of the JDBC specification since JDBC 2.0. Users have been requesting this features in the Derby user list (derby-user@db.apache.org) and it is desirable to implement it.

Chosen approach

Derby already implements result sets of type TYPE_SCROLL_INSENSITIVE and concurrency CONCUR_READ_ONLY as well as result sets of type TYPE_FORWARD_ONLY and concurrencies CONCUR_READ_ONLY and CONCUR_UPDATABLE. We have chosen to combine these two implementations in order to add updatability capabilities to todays implementation of scrollable insensitive read-only reasult sets.

Current scrollable insensitive read-only result set implementation

Scrollable read-only result sets in Derby are implemented in a ResultSet class called ScrollInsensitiveResultSet. This class holds a reference to a source ResultSet that provides it with rows, and a reference to BackingStoreHashtable where these rows are cached. This hashtable uses a key object of the type SQLInteger which represents the position in the ResultSet. Since the hashcode of SQLInteger equals the value itself, the hashtable behaves like a Vector, in terms of perfomance and use. The structure of the hash table in illustrated in figure 1.

Key(SQLInt)Position of the row in the resultSet
Row(DataValueDescriptor[])Data
Figure 1: Structure of the hash table used in scrollable insensitive read-only result set.

When navigating to a row that has not previously been visited, all rows between the last visited row and the desired row will be (a) fetched from the source, and (b) stored in the hashtable; and when navigating to a previously fetched row, this row will be fetched directly from the hash table using the hashtable's key object the row's position in the result set.

Scrollable insensitive updatable implementation

Scrollable insensitive updatable result set is also implemented in the class called ScrollInsensitiveResultSet. This class has been changed so that if the requested result set is updatable it will enable its updatability and detectability features, and if the requested result set is read-only, then the behaviour will be unchanged.

Updatability

Derby uses RowLocation to identify rows for fetches, deletes, and updates through a conglomerate controller. In order to perform one of these operation to a specific row it is necessary to know the its RowLocation to be able to navigate to it. By adding the RowLocation information of each row to the hashtable used in ScrollInsensitiveResultSet class, it is possible to enable updatability to scrollable insensitive result sets. The new structure of the hashtable is shown in figure 2. NOTE: this field is only added if the result set is updatable.

Key(SQLInt)Position of the row in the resultSet
rLoc(RowLocation)RowLocation of the row
Row(DataValueDescriptor[])Data
Figure 2: The field rLoc was added to the the hash table used in scrollable insensitive result set in order to make it possible to position the cursor on a specific row.

Besides adding this new column to the hashtable, it is also necessary to augment the behaviour of the navigation somewhat: This is because prior to SUR, in forward only result sets or with read-only scrollable insensitive result sets, the rows were fetched once, sequentially. Even when repositioning (to a previously fecthed row) in a read-only insensitive result set, the underlying result set's position did not change (one just "repositions" in the hashtable), so fetching more rows from the underlying result set could resume directly as needed.

In contrast, when the result set is updatable, the position of the underlying result set needs to be changed, so that an update or delete operation can take place. The stored rowLocation is used for this purpose, cf NoPutResultSet#positionScanAtRowLocation. Similarly, when navigating to a row that has not yet been fetched, it is necessary to (re)position the cursor back to the RowLocation of the last fetched row before the fetch starts.

The update or delete is performed using positioned SQL statements as it has been done for forward only ResultSets. The RowLocation is used when navigating to position the scan controller at the correct row, and aquiring locks when necessary. Since the scan controller is positioned on the correct row, and the necessary locks are aquired, it is possible to perform a positioned SQL statement for both deletes and updates in the same way as for forward-only result sets.

The associated work to be performed on an update or delete, such as firing triggers, enforcing constraints etc, is done in the same way as for other positioned updates. Changes to the row caused by the execution of triggers will not be reflected in the result set.

Conflict handling

The ScrollInsensitieve ResultSet will be populated with by getting rows from another ResultSet which can only move forward (a source ResultSet). While the source ResultSet moves forward, it will set update locks to rows it returns to the ScrollInsensitiveResultSet. Once the source ResultSet has returned a row to the ScrollInsensitiveResultSet, the lock may be released when the user repositions, depending on isolation level.

Isolation level read-uncommitted / read-committed:
The row lock (U) for the visited row is released as soon as the user navigates to another position. If the user renavigates to the row, the lock will be reset, however the data in the resultset will not be refreshed. If the user updates the row, a write lock will be requested, and kept for the duration of the transaction.

The row for the RowLocation can (if not locked) be updated by another transaction while the result set is open. So, what happens when the row is attempted updated from the current transaction via the result set?

The set of columns touched by an updateXXX method will change their value in the underlying data base and in the result set (as given in the updateXXX methods). Other columns will not change their value in the underlying database, even if they (now) have a different value than presently visible in the result set (insensitive).

That is, any changes made by other transactions to columns touched by updateXXX methods in the row (after it was first fetched for insensitive the result) will be silently overwritten.

Note that these effects will also apply to rows changed by our own transaction if made via separate UPDATE statements, no matter what isolation level is active.

If such effects are an issue, the user could increase the isolation level (and/or make own transaction behave!) For Derby, at some point in the future we might want to add updatable sensitive result sets and/or add some syntax to allow extra locking for rows in updatable result set.

Isolation level repeatable read / serializable:

The row lock (U) is kept until the transaction is committed, excluding other transactions from updating or deleting the row. If the user updates the row, the lock will be upgraded to a write lock, and kept for the duration of the transaction.

All isolation levels

If a transaction attempts to delete a row that have previously been deleted (either in the same trasaction, or by another transaction) a warning will be added to the result set (for an updateRow or deleteRow call) or to the statement (for positioned SQL statements), which will also return zero row count.

Visibility and detectability of changes

Whether a result set is able to view and detect changes made by the result set itself and changes made by other statements outside the result set may vary according to the implementation. Metadata calls can be used to inquire about the exact behavior of a particular implementation. The following DatabaseMetadata methods determine the behavior of a result set when it matter detectability. We have chosen to have the following behaviour for scrollable insensitive updatable result sets:

Metadata callReturned value
ownDeletesAreVisible(TYPE_SCROLL_INSENSITIVE)true
ownInsertsAreVisible(TYPE_SCROLL_INSENSITIVE)false
ownUpdatesAreVisible(TYPE_SCROLL_INSENSITIVE)true

othersDeletesAreVisible(TYPE_SCROLL_INSENSITIVE)false
othersInsertsAreVisible(TYPE_SCROLL_INSENSITIVE)false
othersUpdatesAreVisible(TYPE_SCROLL_INSENSITIVE)false

deletesAreDetected(TYPE_SCROLL_INSENSITIVE)true
insertsAreDetected(TYPE_SCROLL_INSENSITIVE)false
updatesAreDetected(TYPE_SCROLL_INSENSITIVE)true

others means other transactions or other object (such as Statement
or ResultSet objects) in the same transaction.

The metadata calls above show that we have chosen to an implementation of result sets that are able to see and detect own updates and deletes. As for the visibility of own changes, the visibility of own updates means that calling a getter method after an updater method has been called will return the new values for a column, while the visibility of own deletes means that deleted row will be replaced by a blank row. Detectability of own changes enable the result set to repond correctly to the rowDeleted() and rowUpdated() calls.

In order to be able to view own changes, whenever the updateRow() method is called, the hashtable has to be updated with the new values for the updated columns, and whenever the deleteRow() method is called the deleted row must be replaced by an empty row. To enable detectability of own changes, there must exists a mechanism to flag a row as updated or deleted. This has been done by adding to new columns of type boolean to the hashtable as illustrated by figure 3. NOTE: this field is only added if the result set is updatable.

Key(SQLInt)Position of the row in the resultSet
rLoc(RowLocation)RowLocation of the row
Deleted(boolean)indicates whether the row has been deleted
Updated(boolean)indicates whether the row has been updated
Row(DataValueDescriptor[])Data
Figure 3: Fields Updated and Deleted were added to the hash table used in scrollable insensitive result sets to enable detectability of changes

Two methods have been added to NoPutResultSet, updateRow(ExecRow row) and markRowAsDeleted(), in order to view own changes. They are called from UpdateResultSet#collectAffectedRows() and UpdateResultSet#updateDeferredRows(), and DeleteResultSet#collectAffectedRows() and DeleteResultSet#updateDeferredRows() respectively and propagate the changes done to the current row to ScrollInsensitiveResultSet, so that the hashtable can be updated.

NOTE: Own changes means changes made by the result set itself, while other changes means changes made by other transactions or other objects in the same transaction. Since Derby uses positioned updates (which actually is an other Statement in the same transaction" to implement deleteRow and updateRow in both Embedded and Client drivers, we have chosen to extend the meaning or own changes so that it also includes changes made through positioned updates.

Changes to the code

Changes to ScrollInsensitiveResultSet

Changes to UpdateResultSet

Changes to DeleteResultSet

Changes to ProjectRestrictResultSet

Changes to TableScanResultSet

Changes to IndexRowToBaseRowResultSet

Changes to CurrentOfResultSet

Changes to NoPutResultSet

Changes to access layer

Metadata

Others