Derby
  1. Derby
  2. DERBY-208

Add support to retrieve lobs for Network Server by locator rather than matierializing the LOB

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 10.1.1.0
    • Fix Version/s: 10.3.1.4
    • Component/s: Network Server
    • Labels:
      None
    • Issue & fix info:
      Release Note Needed

      Description

      Currently Network Server materializes all LOB objects. LOB locator support is part of the DRDA spec so could be added to network server. See
      http://www.opengroup.org/dbiop/

      It is not clear if SQL Support for LOB locators would also be required.

      1. LOBLocator.pdf
        1.17 MB
        Fernanda Pizzorno
      2. LOBLocatorv2.pdf
        121 kB
        Øystein Grøvlen
      3. releaseNote.html
        5 kB
        Øystein Grøvlen
      4. releaseNote.html
        5 kB
        Øystein Grøvlen
      5. releaseNote.html
        5 kB
        Rick Hillegas
      6. releaseNote.html
        5 kB
        Rick Hillegas

        Issue Links

        1.
        Implementing the stored procedures called by the LOB related JDBC methods Sub-task Closed V.Narayanan
         
        2.
        Add code to support request and return of locators over DRDA Sub-task Closed Øystein Grøvlen
         
        3.
        Create framework for calling locator related stored procedures from client Sub-task Closed Øystein Grøvlen
         
        4.
        Implement Blob support for Locators Sub-task Closed Øystein Grøvlen
         
        5.
        Adding the locator information to FD:OCA descriptor (FDODSC) andFD:OCA data (FDODTA) of the SQLDTA objects Sub-task Closed V.Narayanan
         
        6.
        Client PreparedStatement/CallableStatement implementations need to use locators Sub-task Closed V.Narayanan
         
        7.
        Restructure code for Blob/Clob length in client to prepare for locator implementation Sub-task Closed Øystein Grøvlen
         
        8.
        BrokeredConnection needs to forward implementations of locator related methods in EngineConnection to the underlying physical connection Sub-task Closed V.Narayanan
         
        9.
        Connection.createClob() and Connection.createBlob() need to return locator support enabled LOB objects in the NetworkClient Sub-task Closed V.Narayanan
         
        10.
        Remove temporary code added to enable testing of CLOB locator related stored procedures. Sub-task Closed V.Narayanan
         
        11.
        Implement Clob support for locators Sub-task Closed V.Narayanan
         
        12.
        Fix javadoc issues related to locator work Sub-task Closed V.Narayanan
         
        13.
        Fix build problems related to locator work Sub-task Closed V.Narayanan
         
        14.
        Add locator support of soft upgrade to 10.3 Sub-task Closed Øystein Grøvlen
         
        15.
        Enable Clob locator support between NetworkServer and NetworkClient and modify tests that experience changed behaviour due to this(enabling Clob Locators). Sub-task Closed V.Narayanan
         

          Activity

          Hide
          Fernanda Pizzorno added a comment -

          I am investigating possible implementations LOBs using locators instead of materializing the LOB, and I would like to suggest the following approach.

          1. MOTIVATION

          In the current implementation of LOBs in the Derby Client Driver, the LOBs are materialized on the client size. In cases where large LOBS are being accesses, this materialization of the LOBs in the client side can cause an OutOfMemoryError.

          2. APPROACH

          The network client will request LOB columns to be sent as LOB Locators instead of data value bytes. Since the client no longer has a copy of the LOB data value bytes, operations on the LOB data will be performed on the server side, and the results of these operations will be sent to the client. The client will request the execution of an operation by the server by calling a stored procedure.

          2.1 DRDA

          The Output Override Descriptor (OUTOVR) allows the client to specify the output format for data to be returned as output to an SQL statement or query. This command data object must be sent with the command that returns the LOB data column as output (either a CNTQRY - Continue Query, or an EXCSQLSTT Execute SQL Statement commands). LOB columns are returned by default as data value bytes, in order to receive the LOB columns as LOB Locators, the network client must explicitly request it using the OUTOVR command data object.

          2.2 JDBC Driver

          The implementation of LOB related JDBC methods in the network client must be converted to use LOB Locators instead of having a copy of the LOB data in the client side. Since the client will no longer have the data value bytes for the LOB, operations performed on the data must be performed on the server side. This will be done using stored procedures. Each time a LOB related JDBC method is called, the client driver will "convert" that call into a call to one or more stored procedures. These are the necessary stored procedure to implement LOB related JDBC methods:

          • CreateLocator(LocatorType IN SMALLINT, Locator OUT INTEGER)
            -> Allocates a new LOB in a temporary space and returns its Locator
          • ReleaseLocator(LocatorType IN SMALLINT, Locator IN INTEGER)
            -> Releases a LOB.
          • GetPosition(LocatorType IN SMALLINT, Locator IN INTEGER,
            SearchLocator IN INTEGER, SearchLiteral IN STRING,
            FromPosition IN INTEGER, LocatedAt OUT INTEGER)
            -> Returns the position in bytes/characters of the first occurrence of
            SearchLocator or SearchLiteral within the LOB.
          • GetLength(LocatorType IN SMALLINT, SourceLocator IN INTEGER,
            Length OUT INTEGER)
            -> Returns the length in bytes/characters of the LOB
          • GetSubString(LocatorType IN SMALLINT, SourceLocator IN INTEGER,
            FromPosition IN INTEGER, ForLength IN INTEGER,
            String OUT STRING)
            -> Returns a byte array or string containing the bytes/characters
            starting at FromPosition and with length ForLength
          • SetString(LocatorType IN SMALLINT, SourceLocator IN INTEGER,
            FromPosition IN INTEGER, ForLength IN INTEGER,
            String IN STRING)
            -> Replaces the bytes/characters at position FromPosition with
            ForLength bytes/characters from String
          • Truncate(LocatorType IN SMALLINT, SourceLocator IN INTEGER,
            Length IN INTEGER)
            -> Truncates the LOB to Length

          3. COMPATIBILITY

          3.1 Old server with new client

          The new network client will always request to receive the LOB columns as LOB Locator values instead of the data values bytes. This request is done by sending an OUTOVR object describing the desired type with the CNTQRY command. The current implementation of the network server is able to parse these OUTOVR objects so receiving such an object from the client will not cause an error to happen. However, the network server does not do anything with the information received in the OUTOVR object, so it will continue to send the LOB columns as data value bytes instead of sending the LOB Locator.

          In order to preserve compatibility between the current network server and a new network client, the new network client must be able to handle LOB data value bytes.

          3.2 Old client with new server

          The new server will be able to send LOB columns as LOB Locators instead of sending them as LOB data values bytes, if the network client requests it. The behavior of the old client is unchanged, so it will continue to request for the LOB columns to be sent as LOB data value bytes.

          In order to preserve compatibility between the current network client and a new network server, the new network server must be able to return LOB columns both as LOB Locators and as LOB data values bytes. If the client request the LOB columns to be sent as LOB Locators, the server will return them as LOB Locators, otherwise it will return them as LOB data value bytes. This should also preserve the compatibility with the JCC client.

          Show
          Fernanda Pizzorno added a comment - I am investigating possible implementations LOBs using locators instead of materializing the LOB, and I would like to suggest the following approach. 1. MOTIVATION In the current implementation of LOBs in the Derby Client Driver, the LOBs are materialized on the client size. In cases where large LOBS are being accesses, this materialization of the LOBs in the client side can cause an OutOfMemoryError. 2. APPROACH The network client will request LOB columns to be sent as LOB Locators instead of data value bytes. Since the client no longer has a copy of the LOB data value bytes, operations on the LOB data will be performed on the server side, and the results of these operations will be sent to the client. The client will request the execution of an operation by the server by calling a stored procedure. 2.1 DRDA The Output Override Descriptor (OUTOVR) allows the client to specify the output format for data to be returned as output to an SQL statement or query. This command data object must be sent with the command that returns the LOB data column as output (either a CNTQRY - Continue Query, or an EXCSQLSTT Execute SQL Statement commands). LOB columns are returned by default as data value bytes, in order to receive the LOB columns as LOB Locators, the network client must explicitly request it using the OUTOVR command data object. 2.2 JDBC Driver The implementation of LOB related JDBC methods in the network client must be converted to use LOB Locators instead of having a copy of the LOB data in the client side. Since the client will no longer have the data value bytes for the LOB, operations performed on the data must be performed on the server side. This will be done using stored procedures. Each time a LOB related JDBC method is called, the client driver will "convert" that call into a call to one or more stored procedures. These are the necessary stored procedure to implement LOB related JDBC methods: CreateLocator(LocatorType IN SMALLINT, Locator OUT INTEGER) -> Allocates a new LOB in a temporary space and returns its Locator ReleaseLocator(LocatorType IN SMALLINT, Locator IN INTEGER) -> Releases a LOB. GetPosition(LocatorType IN SMALLINT, Locator IN INTEGER, SearchLocator IN INTEGER, SearchLiteral IN STRING, FromPosition IN INTEGER, LocatedAt OUT INTEGER) -> Returns the position in bytes/characters of the first occurrence of SearchLocator or SearchLiteral within the LOB. GetLength(LocatorType IN SMALLINT, SourceLocator IN INTEGER, Length OUT INTEGER) -> Returns the length in bytes/characters of the LOB GetSubString(LocatorType IN SMALLINT, SourceLocator IN INTEGER, FromPosition IN INTEGER, ForLength IN INTEGER, String OUT STRING) -> Returns a byte array or string containing the bytes/characters starting at FromPosition and with length ForLength SetString(LocatorType IN SMALLINT, SourceLocator IN INTEGER, FromPosition IN INTEGER, ForLength IN INTEGER, String IN STRING) -> Replaces the bytes/characters at position FromPosition with ForLength bytes/characters from String Truncate(LocatorType IN SMALLINT, SourceLocator IN INTEGER, Length IN INTEGER) -> Truncates the LOB to Length 3. COMPATIBILITY 3.1 Old server with new client The new network client will always request to receive the LOB columns as LOB Locator values instead of the data values bytes. This request is done by sending an OUTOVR object describing the desired type with the CNTQRY command. The current implementation of the network server is able to parse these OUTOVR objects so receiving such an object from the client will not cause an error to happen. However, the network server does not do anything with the information received in the OUTOVR object, so it will continue to send the LOB columns as data value bytes instead of sending the LOB Locator. In order to preserve compatibility between the current network server and a new network client, the new network client must be able to handle LOB data value bytes. 3.2 Old client with new server The new server will be able to send LOB columns as LOB Locators instead of sending them as LOB data values bytes, if the network client requests it. The behavior of the old client is unchanged, so it will continue to request for the LOB columns to be sent as LOB data value bytes. In order to preserve compatibility between the current network client and a new network server, the new network server must be able to return LOB columns both as LOB Locators and as LOB data values bytes. If the client request the LOB columns to be sent as LOB Locators, the server will return them as LOB Locators, otherwise it will return them as LOB data value bytes. This should also preserve the compatibility with the JCC client.
          Hide
          Bryan Pendleton added a comment -

          > The new network client will always request to receive the LOB columns as LOB Locator values instead of the data values bytes.

          Is it always desirable for the client to request locator handling of LOBs? It seems
          like you're trading off reduced/controlled memory handling (by using locators)
          versus fewer client/server network message exchanges (by the current technique
          of sending all the data from the server to the client in the initial data transfer).

          That is, would there be situations in which a client might prefer not to use the locator
          technique, for performance reasons?

          Show
          Bryan Pendleton added a comment - > The new network client will always request to receive the LOB columns as LOB Locator values instead of the data values bytes. Is it always desirable for the client to request locator handling of LOBs? It seems like you're trading off reduced/controlled memory handling (by using locators) versus fewer client/server network message exchanges (by the current technique of sending all the data from the server to the client in the initial data transfer). That is, would there be situations in which a client might prefer not to use the locator technique, for performance reasons?
          Hide
          Fernanda Pizzorno added a comment -

          > That is, would there be situations in which a client might prefer not to use the locator
          > technique, for performance reasons?

          Yes, when dealing with small LOBs for example, it may be preferable to send all the data from the server to the client in the initial data transfer instead of using the locator technique.

          Show
          Fernanda Pizzorno added a comment - > That is, would there be situations in which a client might prefer not to use the locator > technique, for performance reasons? Yes, when dealing with small LOBs for example, it may be preferable to send all the data from the server to the client in the initial data transfer instead of using the locator technique.
          Hide
          Fernanda Pizzorno added a comment -

          The attached file (LOBLocator.pdf) contains a few use cases of the implementation I proposed.

          Show
          Fernanda Pizzorno added a comment - The attached file (LOBLocator.pdf) contains a few use cases of the implementation I proposed.
          Hide
          Bryan Pendleton added a comment -

          I came across this interesting bug report from another DBMS:
          http://support.microsoft.com/kb/824106

          What steps should our implementation take to ensure that we don't
          have such a problem?

          Show
          Bryan Pendleton added a comment - I came across this interesting bug report from another DBMS: http://support.microsoft.com/kb/824106 What steps should our implementation take to ensure that we don't have such a problem?
          Hide
          Daniel John Debrunner added a comment -

          Reading the result columns in order is actually recommended by the JDBC javadoc for ResultSet for portability, so while it would be good to not have this restriction, I think it would be ok to require it in some form if it made a streaming/locator LOB implementation easier.

          Show
          Daniel John Debrunner added a comment - Reading the result columns in order is actually recommended by the JDBC javadoc for ResultSet for portability, so while it would be good to not have this restriction, I think it would be ok to require it in some form if it made a streaming/locator LOB implementation easier.
          Hide
          Øystein Grøvlen added a comment -

          Great work with the thorough use-cases, Fernanda.

          I suggest to go for an implementations where locators work on copies and not directly on the LOBs. Working directly on a LOB can change the behavior of current applications since updates may occur without a call to updateRow(). Hence, applications which sets parameters and then for some reason skips updateRow(), will experience a changed behavior. Also, it seems like the infrastructure for working on a copy will be needed anyway in order to temporarily store LOBs that have not yet been assigned to a row.

          Show
          Øystein Grøvlen added a comment - Great work with the thorough use-cases, Fernanda. I suggest to go for an implementations where locators work on copies and not directly on the LOBs. Working directly on a LOB can change the behavior of current applications since updates may occur without a call to updateRow(). Hence, applications which sets parameters and then for some reason skips updateRow(), will experience a changed behavior. Also, it seems like the infrastructure for working on a copy will be needed anyway in order to temporarily store LOBs that have not yet been assigned to a row.
          Hide
          Øystein Grøvlen added a comment -

          With respect to the proposed stored procedures for accessing locators,
          I think we will need separate routines for Blob and Clob with
          different in parameter types. Routines for Clobs will operate on
          VARCHAR parameters, while Blob routines will operate on
          VARBINARY/VARCHAR FOR BIT DATA. I propose that we create separate
          routines also where the parameter types does no vary. That way, we
          can get tid of the LocatorType parameter. Based on this, the
          following procedures are needed:

          Blob procedures:

          • BlobCreateLocator(Locator OUT INTEGER)
            -> Allocates a new Blob in a temporary space and returns its Locator
          • BlobReleaseLocator(Locator IN INTEGER)
            -> Releases a Blob.
          • BlobGetPosition(Locator IN INTEGER,
            SearchLocator IN INTEGER,
            SearchLiteral IN VARCHAR FOR BIT DATA,
            FromPosition IN INTEGER,
            LocatedAt OUT INTEGER)
            -> Returns the position in bytes of the first occurrence of
            SearchLocator or SearchLiteral within the Blob.
          • BlobGetLength(SourceLocator IN INTEGER,
            Length OUT INTEGER)
            -> Returns the length in bytes of the Blob
          • BlobGetBytes(SourceLocator IN INTEGER,
            FromPosition IN INTEGER,
            ForLength IN INTEGER,
            Bytes OUT VARCHAR FOR BIT DATA)
            -> Returns a byte array containing the bytes starting at
            FromPosition and with length ForLength
          • BlobSetBytes(SourceLocator IN INTEGER,
            FromPosition IN INTEGER,
            ForLength IN INTEGER,
            Bytes IN VARCHAR FOR BIT DATA)
            -> Replaces the bytes at position FromPosition with
            ForLength bytes from Bytes
          • BlobTruncate(SourceLocator IN INTEGER,
            Length IN INTEGER)
            -> Truncates the Blob to Length

          Clob procedures:

          • ClobCreateLocator(Locator OUT INTEGER)
            -> Allocates a new Clob in a temporary space and returns its Locator
          • ClobReleaseLocator(Locator IN INTEGER)
            -> Releases a Clob.
          • ClobGetPosition(Locator IN INTEGER,
            SearchLocator IN INTEGER,
            SearchLiteral IN VARCHAR,
            FromPosition IN INTEGER,
            LocatedAt OUT INTEGER)
            -> Returns the position in characters of the first occurrence of
            SearchLocator or SearchLiteral within the Clob.
          • ClobGetLength(SourceLocator IN INTEGER,
            Length OUT INTEGER)
            -> Returns the length in characters of the Clob
          • ClobGetSubString(SourceLocator IN INTEGER,
            FromPosition IN INTEGER,
            ForLength IN INTEGER,
            String OUT VARCHAR)
            -> Returns a string containing the characters
            starting at FromPosition and with length ForLength
          • ClobSetSubString(SourceLocator IN INTEGER,
            FromPosition IN INTEGER,
            ForLength IN INTEGER,
            String IN VARCHAR)
            -> Replaces the characters at position FromPosition with
            ForLength characters from String
          • ClobTruncate(SourceLocator IN INTEGER,
            Length IN INTEGER)
            -> Truncates the Clob to Length

          One may also consider using functions instead of procedures for those
          routines that have an OUT parameter. If DERBY-2201 is fixed, we could
          let BlobGetBytes/ClobGetSubstring return Blob/Clob and avoid having to
          partition the Clob/Blob into fragments that fits into VARCHARS.

          Show
          Øystein Grøvlen added a comment - With respect to the proposed stored procedures for accessing locators, I think we will need separate routines for Blob and Clob with different in parameter types. Routines for Clobs will operate on VARCHAR parameters, while Blob routines will operate on VARBINARY/VARCHAR FOR BIT DATA. I propose that we create separate routines also where the parameter types does no vary. That way, we can get tid of the LocatorType parameter. Based on this, the following procedures are needed: Blob procedures: BlobCreateLocator(Locator OUT INTEGER) -> Allocates a new Blob in a temporary space and returns its Locator BlobReleaseLocator(Locator IN INTEGER) -> Releases a Blob. BlobGetPosition(Locator IN INTEGER, SearchLocator IN INTEGER, SearchLiteral IN VARCHAR FOR BIT DATA, FromPosition IN INTEGER, LocatedAt OUT INTEGER) -> Returns the position in bytes of the first occurrence of SearchLocator or SearchLiteral within the Blob. BlobGetLength(SourceLocator IN INTEGER, Length OUT INTEGER) -> Returns the length in bytes of the Blob BlobGetBytes(SourceLocator IN INTEGER, FromPosition IN INTEGER, ForLength IN INTEGER, Bytes OUT VARCHAR FOR BIT DATA) -> Returns a byte array containing the bytes starting at FromPosition and with length ForLength BlobSetBytes(SourceLocator IN INTEGER, FromPosition IN INTEGER, ForLength IN INTEGER, Bytes IN VARCHAR FOR BIT DATA) -> Replaces the bytes at position FromPosition with ForLength bytes from Bytes BlobTruncate(SourceLocator IN INTEGER, Length IN INTEGER) -> Truncates the Blob to Length Clob procedures: ClobCreateLocator(Locator OUT INTEGER) -> Allocates a new Clob in a temporary space and returns its Locator ClobReleaseLocator(Locator IN INTEGER) -> Releases a Clob. ClobGetPosition(Locator IN INTEGER, SearchLocator IN INTEGER, SearchLiteral IN VARCHAR, FromPosition IN INTEGER, LocatedAt OUT INTEGER) -> Returns the position in characters of the first occurrence of SearchLocator or SearchLiteral within the Clob. ClobGetLength(SourceLocator IN INTEGER, Length OUT INTEGER) -> Returns the length in characters of the Clob ClobGetSubString(SourceLocator IN INTEGER, FromPosition IN INTEGER, ForLength IN INTEGER, String OUT VARCHAR) -> Returns a string containing the characters starting at FromPosition and with length ForLength ClobSetSubString(SourceLocator IN INTEGER, FromPosition IN INTEGER, ForLength IN INTEGER, String IN VARCHAR) -> Replaces the characters at position FromPosition with ForLength characters from String ClobTruncate(SourceLocator IN INTEGER, Length IN INTEGER) -> Truncates the Clob to Length One may also consider using functions instead of procedures for those routines that have an OUT parameter. If DERBY-2201 is fixed, we could let BlobGetBytes/ClobGetSubstring return Blob/Clob and avoid having to partition the Clob/Blob into fragments that fits into VARCHARS.
          Hide
          Øystein Grøvlen added a comment -

          Uploaded updated version of use cases for locator-based implementation. I have removed the use cases for updating LOBs directly since the implementation will work on copies.

          Show
          Øystein Grøvlen added a comment - Uploaded updated version of use cases for locator-based implementation. I have removed the use cases for updating LOBs directly since the implementation will work on copies.
          Hide
          Øystein Grøvlen added a comment -

          Release note need to describe changes in behavior introduced by locator work.

          Show
          Øystein Grøvlen added a comment - Release note need to describe changes in behavior introduced by locator work.
          Hide
          Øystein Grøvlen added a comment -

          Release note for the locator based implementation of the Network Client.

          Show
          Øystein Grøvlen added a comment - Release note for the locator based implementation of the Network Client.
          Hide
          Øystein Grøvlen added a comment -

          Updated version of the release note.

          Show
          Øystein Grøvlen added a comment - Updated version of the release note.
          Hide
          Rick Hillegas added a comment -

          Scrub release note, removing cruft introduced by html editor. Closed unbalanced <li> tags.

          Show
          Rick Hillegas added a comment - Scrub release note, removing cruft introduced by html editor. Closed unbalanced <li> tags.
          Hide
          Rick Hillegas added a comment -

          More mods to the release note: this time the casing of the paragraph tags is changed from upper to lower.

          Show
          Rick Hillegas added a comment - More mods to the release note: this time the casing of the paragraph tags is changed from upper to lower.

            People

            • Assignee:
              Unassigned
              Reporter:
              Kathey Marsden
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development