|
hmmm.. using the error.java attached to the original email thread I get the IndexOutOfBoundsException.
performing dump. java.lang.IndexOutOfBoundsException: Index: 0, Size: 0 at java.util.ArrayList.RangeCheck(ArrayList.java:572) at java.util.ArrayList.get(ArrayList.java:347) at org.apache.derby.client.net.NetCursor.findExtdtaData(NetCursor.java:1049) at org.apache.derby.client.net.NetCursor.getClobColumn_(NetCursor.java:1122) at org.apache.derby.client.am.Cursor.getClob(Cursor.java:1187) at org.apache.derby.client.am.ResultSet.getClob(ResultSet.java:1259) at error.getLob(error.java:97) at error.dump_db(error.java:161) at error.main(error.java:172) This regression was caused by the following change:
http://svn.apache.org/viewvc?view=rev&revision=544481 Contributed by V.Narayanan It was a small patch but enabled code from other Jira issues I think. At the root of this issue is this function in EmbedConnection. We seem to increment this value for every lob value on the connection and then send it as a signed int which overflows at 32768, causing the high bit to be set and the client to think that it is getting a lob value instead of locator.
/** * Return the current locator value * @return an integer that represents the most recent locator value. */ private int getIncLOBKey() { return ++rootConnection.lobHMKey ; } I'd appreciate input from the lob locator folks on what is the best course of action at this point. Should we switch to sending by value when we reach this limit? As an aside, one thing I noticed is that lobHMKey actually never gets reset, even with EmbedConnection.clearLOBMapping() . That would not help in this case however as we don't get a commit. Another thing I noticed when looking at this bug is that each lob is added to the HashMap twice. Once as a side effect of getObject and then again with a call to:
int locator = database.getConnection().addLOBMapping(val); so the repro fails on row 16384 instead of 32768. Attached is a patch to fix the double entry of lobs in the lob hashmap for network server. We now store the locator in the lob so network server can retrieve it instead of doing a second store of the lob.
The test case still fails but retrieves double the number of rows before failing, so it does not fix the core bug. I just thought I would fix this while it was in front of me. I didn't add a functional test. I wasn't sure how appropriate it was to add a test with a table with 32K rows to the functional tests. The patch also resets the lob key on commit/rollback which will help prevent this bug from hitting upon retrieving 32K lobs even across transaction boundaries. Does the DRDA spec require that locators are two-byte integers, or could we use four bytes? Seems like we at least should make EmbedConnection.lobHMKey wrap around on the max value we are able to send over DRDA, and ensure that it doesn't go negative.
I don't think resetting the lob key on commit/rollback works, as the client does not expect the keys to be reused. Old Clob/Blob objects on the client will then start returning data for the queries performed in the new transaction. I ran the code below to test it. -----8<----- import java.sql.*; public class ResetCounter { public static void main(String[] args) throws Exception { Class.forName("org.apache.derby.jdbc.ClientDriver"); Connection c = DriverManager.getConnection( "jdbc:derby://localhost/db;create=true"); Statement s1 = c.createStatement(); ResultSet rs1 = s1.executeQuery("values cast('first' as clob)"); rs1.next(); Clob first = rs1.getClob(1); rs1.close(); // auto-commit happens here Statement s2 = c.createStatement(); ResultSet rs2 = s2.executeQuery("values cast('second' as clob)"); rs2.next(); Clob second = rs2.getClob(1); System.out.println("1: " + first.getSubString(1, 100)); // should fail System.out.println("2: " + second.getSubString(1, 100)); rs2.close(); } } ----->8----- Without the patch, first.getSubString(1, 100) failed (correctly) because the transaction in which the Clob was created, had been committed. With the patch applied, it instead returned the data that the second transaction had retrieved. Thanks Knut for looking at the patch and catching that resetting the lob key on commit/rollback doesn't work. I will add your test to the regression test so we have a test for that. It seems to me that we would have the same problem (albeit further removed) if we roll over the lob key at 32K. I will look at the DRDA spec and see if there is an opportunity to send a larger value.
I wonder if you had a chance to look at the rest of the patch to avoid storing the locator twice for each lob. Does that change look ok? > It seems to me that we would have the same problem (albeit further removed) if we roll over the lob key at 32K. I will look at the DRDA spec and see if there is an opportunity to send a larger value.
Yes, we would have the same problem. If we find a way to use e.g. 4-byte integers it would only manifest itself if someone retrieves a lob, keeps it while retrieving billions of other lobs in the same connection, and then tries to access the first lob. In that case, I think it's highly unlikely that someone will be bitten by it. I think the rest of the patch looks OK (except there's a typo in the javadoc for EmbedClob.getLocator(): loctor -> locator, and in EngineBlob/EngineClob: getLobLobMapping() -> getLOBMapping()). Do you think we should use getLocator() instead of addLOBMapping() in LobStoredProcedure.BLOBCREATELOCATOR() and CLOBCREATELOCATOR() as well? Seems like the Blobs/Clobs they create already have called addLOBMapping() in their constructors. Another thing I came to think about: Would it be better to remove addLOBMapping() from the constructors in EmbedBlob/EmbedClob and instead call it lazily from getLocator()? Something like: public int getLocator() { if (locator == -1) { locator = getEmbedConnection().addLOBMapping(this); } return locator; } Then we'd also remove the overhead of maintaining the lob mapping in embedded mode. I have not time to look at the details of this problem today, but I
thought I would write up a few things based on what I remember from reviewing this work. I am not sure if it is a good idea, but the locator mapping table is used for two separate issues: 1. It used by the network server to store the mapping from locator by client to EmbedBlob/EmbedClob. This is used by stored procedures that is called by the client for LOB operations. 2. It is used by the embedded driver to store references to LOB objects that due to their size temporarily spills to disk. This is done so that the temporary files can be cleaned up on commit/rollback. To me it sounds like the item 2. is implemented so that all LOB objects are stored in the mapping table, not just those that need clean-up. (I am guessing, I have not checked the code). That would make two entries for every client LOB. With respect to the wrapping locator generator, I think it should be possible to reuse locators that have been released. Hence, one way to solve this is to start at 0 when it wraps, and iterate until an unused locator is found. Oystein
>With respect to the wrapping locator generator, I think it should be >possible to reuse locators that have been released. Hence, one way to >solve this is to start at 0 when it wraps, and iterate until an unused >locator is found. Still the test case attached to this issue would fail as there are no calls to Clob.free() since it appears to be written against jdk1.5. The core problem is that we are limitted to 32K entries. I was looking at the spec at section 5.6.5.13 and if I read that correctly (which perhaps I am not) we are limitted to 4 bytes for the locator. I am not sure how to get around this hard limit. > I was looking at the spec at section 5.6.5.13 and if I read that correctly (which perhaps I am not) we are limitted to 4 bytes for the locator. I am not sure how to get around this hard limit.
I think you read it correctly (section 5.6.5 says bits 10-11 in lower box represent maximum length in bytes, not including length field or null indicator, and section 5.6.5.13 has the value 4 in bits 10-11). Note that with 4 bytes we are limited to ~4 billion entries (or ~2 billion if the sign bit is stripped off), not 32K. It's just that the current implementation only uses two bytes, hence the 32K limit. Here is the patch to fix the double hashmap entries without resetting the lob key. I plan to commit this tomorrow if I don't hear back.
Kathey So we do send and receive a 4 byte int, but we have this check in the locator(int column) method of NetCursor which says the highest bit will be set if it is not a locator, but it is not the highest bit that we are checking, so we end up using only two bytes for our locator.
Perhaps in the case of lob value we are reading the extended length indicator which will be 0x800x. I'm not sure if that's what's going on though. I did try against a 10.2 server to see if the check is necessary for sending lobs by value and found that indeed it is. Removing the check in 10.4 and just returning the locator gets us through the repro. It seems like maybe instead of this check we should be able to tell by the drda type whether we have a locator or a value. private int locator(int column) { int locator = get_INTEGER(column); // If Lob value was sent instead of locator, highest bit will be set // Zero is not a valid locator, it indicates a zero length value if (((locator & 0x8000) == 0x8000) || (locator == 0)) { return Lob.INVALID_LOCATOR; } else { return locator; } } The new patch looks fine, but it still has the typos I mentioned in an earlier comment. It should also change LOBStoredProcedure.BLOBCREATELOCATOR and LOBStoredProcedure.CLOBCREATELOCATOR so that they use getLocator() instead of addLOBMapping().
I think this change also means that addLOBMapping() should be removed from EngineConnection and BrokeredConnection, and made package private in the EmbedConnection class. It would also be good to make the locator field in EmbedBlob/EmbedClob final, so that we check on compile-time that it hasn't been forgotten in one of the constructors. I think part of the confusion here is that EmbedConnection#lobHashMap
is used for two separate purposes. It was first added as a mechanism for the NetworkServer to add locators mapping for locators that were sent to the client. This mechanism was intented to by driven by the NetworkServer, but EmbedConnection is used for storage since it is persistent across client requests. Later, the same hash map was used to store references to all internal Lobs in order to be able to do clean-up at end of transaction. (I was wrong when I earlier wrote that this was really only needed for LOB objects that had associated temporary files. This mechanism is used in order to invalidate any active LOB object at the end of transaction.) There are other ways to achieve invalidation at end of transaction, but as long as we are using the current mechanism, it will be necessary to keep track of all active LOB objects of a connection. Given that, it is not necessary for the network server to do its own book-keeping. Instead, it could rely on the embedded driver for this. I think Kathey's patch is a step in the right direction here, but as Knut Anders suggest I think we should go a step further and remove all traces of how this is currently done by the network server. I also think that it would be better if this clean-up was tied to on the DRDA problems that causes the reported bug. (I will discuss that issue in a separate comment.) The reasoning for the check in NetCursor#locator was to be able to
both handle pre-10.3 servers, and 10.3 servers running in soft-upgrade mode, both which ignore requests for use of locators. I guess the first group could be handled by recognizing that the server is pre-10.3, and it should also be possible to come up with a separate mechanism to handle soft upgrade. However, It would be even better if one where able to communicate the type of each value dynamically, but I were not able to determine how to do this with DRDA. Does anybody know? I think that when I added the code that checked whether a locator or LOB had been sent, I thought that the when a LOB was sent, its column position would contain the length of the LOB with the high-order bit set. However, it seems that what has originally been sent is not the length of the LOB, but the length of the length field for the LOB, and it is not the high-order bit that is set, but the high-order bit of byte 3. When Layer B streaming was added, it seems the length of the length field was fixed at 4 bytes so that now the client will always receive 0x8004 for non-zero length LOBs. I do not know whether the above is in accordance with the DRDA spec, and it also seems that the client expect the high-order bit to be set when the length is unknown. From NetCursor#isNonTrivialDataLob: // if the high-order bit is set, length is unknown -> set value to x'FF..FF' if (((dataBuffer_[position]) & 0x80) == 0x80) { length = -1; } else { A quick, but dirty, way to fix the reported bug would be to make the server skip 0x8004 when allocating locators, and make the client check for this exact value when determining whether it has received a locator or not. Oystein asked:
>However, It would be even better if >one where able to communicate the type of each value dynamically, but >I were not able to determine how to do this with DRDA. Does anybody know? I think that the sqlType that we send with the SQLDAGRP should be set to one of the following when we send by locator. public static final int DB2_SQLTYPE_BLOB_LOCATOR = 960; // BLOB locator public static final int DB2_SQLTYPE_NBLOB_LOCATOR = 961; public static final int DB2_SQLTYPE_CLOB_LOCATOR = 964; // CLOB locator public static final int DB2_SQLTYPE_NCLOB_LOCATOR = 965; Right now we always send one of these whether it is by locator or value. public static final int DB2_SQLTYPE_BLOB = 404; // BLOB public static final int DB2_SQLTYPE_NBLOB = 405; public static final int DB2_SQLTYPE_CLOB = 408; // CLOB public static final int DB2_SQLTYPE_NCLOB = 409; If we send the specific SQL Type then the client should be able to branch its logic based on that. There are issues of backward compatibility which would have to be dealt with, but it seems a better way for the client to figure out whether it is dealing with a locator or a value. Another long term drda solution might be Dynamic data format, described in section 7.8 of the drda manual.
https://issues.apache.org/jira/browse/DERBY-3353 Either this solution or branching based on the sqlType would mean version specific drda handling. So I think I might pursue Oystein's suggestion for a quick fix and then more long term solutions can be pursued on the trunk. Attached is a patch which implements Oystein's suggestion to exclude the extended length value from being a valid locator value. I had to exclude 0x8000, 0x8002, 0x8004, and 0x8008 as these are all possible valid values with 10.1.
I tried the repro with the patch with various version combinations of 10.1, 10.2 and trunk with the patch. I am running tests now. For completeness, you may also want to skip all negative values and zero (the client only accepts positive locators). This only happens if you wrap around at ~2 billion, but with long-lived connections (for instance in a connection pool) it's possible. We're only talking about 68 lobs a second continuously for one year... ;)
It would be good to expand the comment in getIncLOBKey() since this is quite ugly hack, and it won't be obvious to a reader why we skip those values. I propose something along these lines:
// Skip 0x8000, 0x8002, 0x8004, 0x8006, for // Earlier versions of the Derby Network Server (<10.3) didn't // support locators and would send an extended length field // with one of the above mentioned values instead of a // locator, even when locators were requested. To enable the // client driver to detect that locators aren't supported, // we don't use any of them as locator values. Attached is derby-3243_diff.txt addressing Knut's comments to skip negative and 0 locators.
This is marked 'existing application impact' but I don't see any reason why. Does the solution require some change to existing applications?
Unchecking Existing Application Impact. This fix does not require any action or have impact on existing applications.
Hello Daniel,
thank you for all the effort. I have opened the request when my application "outseeker" (copies emails from MS Outlook into Derby) stopped working beyond a certain message volume. I was able to "fix" the problem by moving to the embedded version, thereby reducing the functionality (location independence) of my application. Similarly, I found "critical" to be a strong attribute, but reading through the definitions convinced me that it fit the problem. I apologise if I did make a classification mistake. Best regards, Oliver Seidel https://issues.apache.org/jira/browse/DERBY-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565825#action_12565825 reason why. "ResultSet" of -------------------------------------------------------------------------------------------- https://issues.apache.org/jira/browse/DERBY-3243 ------------------ 10.3.1.4 - is where case by messages clob). This is a demonstration Exception", but a diagnosis that data. -------------------------------------------------------------------------------- ------------------------------------------------------------------------ ------------------------------------------------------------------------ ------------------------------------------------------------------------ java.io.StringWriter(); java.io.PrintWriter(sw); ------------------------------------------------------------------------ org.apache.derby.drda.NetworkServerControl(java.net.InetAddress.getByName(host),port); "+Integer.toString(port)+"."); ------------------------------------------------------------------------ ---------------------------------------------------------------------- }; ---------------------------------------------------------------------- 0; }; 0; }; return null; return -1; }; -1; }; myclob(x); }; ------------------------------------------------------------------------ String ------------------------------------------------------------------------ x.replaceAll("[\0\r\\\\]","").replaceAll("'","\\\"").replaceAll(",+",","); ------------------------------------------------------------------------ (Math.pow(10.0,digits*Math.random())); ------------------------------------------------------------------------ ------------------------------------------------------------------------ varchar(100), ?, ? ------------------------------------------------------------------------ from ------------------------------------------------------------------------ ------------------------------------------------------------------------ |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
java.sql.SQLException: Insufficient data while reading from the network - expected a minimum of 6 bytes and received on
y -1 bytes. The connection has been terminated.
at org.apache.derby.client.am.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:46)
at org.apache.derby.client.am.SqlException.getSQLException(SqlException.java:362)
at org.apache.derby.client.am.PreparedStatement.executeUpdate(PreparedStatement.java:391)
at error.fill_db(error.java:147)
at error.main(error.java:171)
Caused by: org.apache.derby.client.am.DisconnectException: Insufficient data while reading from the network - expected
minimum of 6 bytes and received only -1 bytes. The connection has been terminated.
at org.apache.derby.client.net.Reply.fill(Reply.java:195)
at org.apache.derby.client.net.Reply.ensureALayerDataInBuffer(Reply.java:215)
at org.apache.derby.client.net.Reply.readDssHeader(Reply.java:317)
at org.apache.derby.client.net.Reply.startSameIdChainParse(Reply.java:1147)
at org.apache.derby.client.net.NetStatementReply.readExecute(NetStatementReply.java:69)
at org.apache.derby.client.net.StatementReply.readExecute(StatementReply.java:55)
at org.apache.derby.client.net.NetPreparedStatement.readExecute_(NetPreparedStatement.java:183)
at org.apache.derby.client.am.PreparedStatement.readExecute(PreparedStatement.java:1796)
at org.apache.derby.client.am.PreparedStatement.flowExecute(PreparedStatement.java:2116)
at org.apache.derby.client.am.PreparedStatement.executeUpdateX(PreparedStatement.java:396)
at org.apache.derby.client.am.PreparedStatement.executeUpdate(PreparedStatement.java:382)
... 2 more
With 10.2.2.0 the program runs cleanly. Marking as a regression.