|
> The new network client will always request to receive the LOB columns as LOB Locator values instead of the data values bytes.
Is it always desirable for the client to request locator handling of LOBs? It seems like you're trading off reduced/controlled memory handling (by using locators) versus fewer client/server network message exchanges (by the current technique of sending all the data from the server to the client in the initial data transfer). That is, would there be situations in which a client might prefer not to use the locator technique, for performance reasons? > That is, would there be situations in which a client might prefer not to use the locator
> technique, for performance reasons? Yes, when dealing with small LOBs for example, it may be preferable to send all the data from the server to the client in the initial data transfer instead of using the locator technique. The attached file (LOBLocator.pdf) contains a few use cases of the implementation I proposed.
I came across this interesting bug report from another DBMS:
http://support.microsoft.com/kb/824106 What steps should our implementation take to ensure that we don't have such a problem? Reading the result columns in order is actually recommended by the JDBC javadoc for ResultSet for portability, so while it would be good to not have this restriction, I think it would be ok to require it in some form if it made a streaming/locator LOB implementation easier.
Great work with the thorough use-cases, Fernanda.
I suggest to go for an implementations where locators work on copies and not directly on the LOBs. Working directly on a LOB can change the behavior of current applications since updates may occur without a call to updateRow(). Hence, applications which sets parameters and then for some reason skips updateRow(), will experience a changed behavior. Also, it seems like the infrastructure for working on a copy will be needed anyway in order to temporarily store LOBs that have not yet been assigned to a row. With respect to the proposed stored procedures for accessing locators,
I think we will need separate routines for Blob and Clob with different in parameter types. Routines for Clobs will operate on VARCHAR parameters, while Blob routines will operate on VARBINARY/VARCHAR FOR BIT DATA. I propose that we create separate routines also where the parameter types does no vary. That way, we can get tid of the LocatorType parameter. Based on this, the following procedures are needed: Blob procedures: * BlobCreateLocator(Locator OUT INTEGER) -> Allocates a new Blob in a temporary space and returns its Locator * BlobReleaseLocator(Locator IN INTEGER) -> Releases a Blob. * BlobGetPosition(Locator IN INTEGER, SearchLocator IN INTEGER, SearchLiteral IN VARCHAR FOR BIT DATA, FromPosition IN INTEGER, LocatedAt OUT INTEGER) -> Returns the position in bytes of the first occurrence of SearchLocator or SearchLiteral within the Blob. * BlobGetLength(SourceLocator IN INTEGER, Length OUT INTEGER) -> Returns the length in bytes of the Blob * BlobGetBytes(SourceLocator IN INTEGER, FromPosition IN INTEGER, ForLength IN INTEGER, Bytes OUT VARCHAR FOR BIT DATA) -> Returns a byte array containing the bytes starting at FromPosition and with length ForLength * BlobSetBytes(SourceLocator IN INTEGER, FromPosition IN INTEGER, ForLength IN INTEGER, Bytes IN VARCHAR FOR BIT DATA) -> Replaces the bytes at position FromPosition with ForLength bytes from Bytes * BlobTruncate(SourceLocator IN INTEGER, Length IN INTEGER) -> Truncates the Blob to Length Clob procedures: * ClobCreateLocator(Locator OUT INTEGER) -> Allocates a new Clob in a temporary space and returns its Locator * ClobReleaseLocator(Locator IN INTEGER) -> Releases a Clob. * ClobGetPosition(Locator IN INTEGER, SearchLocator IN INTEGER, SearchLiteral IN VARCHAR, FromPosition IN INTEGER, LocatedAt OUT INTEGER) -> Returns the position in characters of the first occurrence of SearchLocator or SearchLiteral within the Clob. * ClobGetLength(SourceLocator IN INTEGER, Length OUT INTEGER) -> Returns the length in characters of the Clob * ClobGetSubString(SourceLocator IN INTEGER, FromPosition IN INTEGER, ForLength IN INTEGER, String OUT VARCHAR) -> Returns a string containing the characters starting at FromPosition and with length ForLength * ClobSetSubString(SourceLocator IN INTEGER, FromPosition IN INTEGER, ForLength IN INTEGER, String IN VARCHAR) -> Replaces the characters at position FromPosition with ForLength characters from String * ClobTruncate(SourceLocator IN INTEGER, Length IN INTEGER) -> Truncates the Clob to Length One may also consider using functions instead of procedures for those routines that have an OUT parameter. If let BlobGetBytes/ClobGetSubstring return Blob/Clob and avoid having to partition the Clob/Blob into fragments that fits into VARCHARS. Uploaded updated version of use cases for locator-based implementation. I have removed the use cases for updating LOBs directly since the implementation will work on copies.
Release note need to describe changes in behavior introduced by locator work.
Release note for the locator based implementation of the Network Client.
Updated version of the release note.
Scrub release note, removing cruft introduced by html editor. Closed unbalanced <li> tags.
More mods to the release note: this time the casing of the paragraph tags is changed from upper to lower.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1. MOTIVATION
In the current implementation of LOBs in the Derby Client Driver, the LOBs are materialized on the client size. In cases where large LOBS are being accesses, this materialization of the LOBs in the client side can cause an OutOfMemoryError.
2. APPROACH
The network client will request LOB columns to be sent as LOB Locators instead of data value bytes. Since the client no longer has a copy of the LOB data value bytes, operations on the LOB data will be performed on the server side, and the results of these operations will be sent to the client. The client will request the execution of an operation by the server by calling a stored procedure.
2.1 DRDA
The Output Override Descriptor (OUTOVR) allows the client to specify the output format for data to be returned as output to an SQL statement or query. This command data object must be sent with the command that returns the LOB data column as output (either a CNTQRY - Continue Query, or an EXCSQLSTT Execute SQL Statement commands). LOB columns are returned by default as data value bytes, in order to receive the LOB columns as LOB Locators, the network client must explicitly request it using the OUTOVR command data object.
2.2 JDBC Driver
The implementation of LOB related JDBC methods in the network client must be converted to use LOB Locators instead of having a copy of the LOB data in the client side. Since the client will no longer have the data value bytes for the LOB, operations performed on the data must be performed on the server side. This will be done using stored procedures. Each time a LOB related JDBC method is called, the client driver will "convert" that call into a call to one or more stored procedures. These are the necessary stored procedure to implement LOB related JDBC methods:
* CreateLocator(LocatorType IN SMALLINT, Locator OUT INTEGER)
-> Allocates a new LOB in a temporary space and returns its Locator
* ReleaseLocator(LocatorType IN SMALLINT, Locator IN INTEGER)
-> Releases a LOB.
* GetPosition(LocatorType IN SMALLINT, Locator IN INTEGER,
SearchLocator IN INTEGER, SearchLiteral IN STRING,
FromPosition IN INTEGER, LocatedAt OUT INTEGER)
-> Returns the position in bytes/characters of the first occurrence of
SearchLocator or SearchLiteral within the LOB.
* GetLength(LocatorType IN SMALLINT, SourceLocator IN INTEGER,
Length OUT INTEGER)
-> Returns the length in bytes/characters of the LOB
* GetSubString(LocatorType IN SMALLINT, SourceLocator IN INTEGER,
FromPosition IN INTEGER, ForLength IN INTEGER,
String OUT STRING)
-> Returns a byte array or string containing the bytes/characters
starting at FromPosition and with length ForLength
* SetString(LocatorType IN SMALLINT, SourceLocator IN INTEGER,
FromPosition IN INTEGER, ForLength IN INTEGER,
String IN STRING)
-> Replaces the bytes/characters at position FromPosition with
ForLength bytes/characters from String
* Truncate(LocatorType IN SMALLINT, SourceLocator IN INTEGER,
Length IN INTEGER)
-> Truncates the LOB to Length
3. COMPATIBILITY
3.1 Old server with new client
The new network client will always request to receive the LOB columns as LOB Locator values instead of the data values bytes. This request is done by sending an OUTOVR object describing the desired type with the CNTQRY command. The current implementation of the network server is able to parse these OUTOVR objects so receiving such an object from the client will not cause an error to happen. However, the network server does not do anything with the information received in the OUTOVR object, so it will continue to send the LOB columns as data value bytes instead of sending the LOB Locator.
In order to preserve compatibility between the current network server and a new network client, the new network client must be able to handle LOB data value bytes.
3.2 Old client with new server
The new server will be able to send LOB columns as LOB Locators instead of sending them as LOB data values bytes, if the network client requests it. The behavior of the old client is unchanged, so it will continue to request for the LOB columns to be sent as LOB data value bytes.
In order to preserve compatibility between the current network client and a new network server, the new network server must be able to return LOB columns both as LOB Locators and as LOB data values bytes. If the client request the LOB columns to be sent as LOB Locators, the server will return them as LOB Locators, otherwise it will return them as LOB data value bytes. This should also preserve the compatibility with the JCC client.