Issue Details (XML | Word | Printable)

Key: DERBY-2493
Type: Improvement Improvement
Status: Closed Closed
Resolution: Fixed
Priority: Minor Minor
Assignee: Knut Anders Hatlen
Reporter: Knut Anders Hatlen
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Derby

Use unsynchronized collections in BackingStoreHashtable

Created: 27/Mar/07 07:23 PM   Updated: 29/Jun/09 02:11 PM
Return to search
Component/s: Store
Affects Version/s: 10.3.1.4
Fix Version/s: 10.3.1.4

Time Tracking:
Not Specified

File Attachments:
  Size
File Licensed for inclusion in ASF works derby-2493-aggregate.diff 2007-03-29 12:51 PM Knut Anders Hatlen 7 kB
Text File Licensed for inclusion in ASF works derby-2493-aggregate.stat 2007-03-29 12:51 PM Knut Anders Hatlen 0.2 kB
File Licensed for inclusion in ASF works derby-2493-assert.diff 2007-03-29 08:29 AM Knut Anders Hatlen 6 kB
Text File Licensed for inclusion in ASF works derby-2493-assert.stat 2007-03-29 08:29 AM Knut Anders Hatlen 0.3 kB
File Licensed for inclusion in ASF works derby-2493-hashtable.diff 2007-04-10 03:11 PM Knut Anders Hatlen 11 kB
File Licensed for inclusion in ASF works derby-2493-vector.diff 2007-04-12 12:19 PM Knut Anders Hatlen 14 kB
File Licensed for inclusion in ASF works derby-2493-vector.stat 2007-04-12 12:19 PM Knut Anders Hatlen 0.4 kB

Bug behavior facts: Performance
Resolution Date: 13/Apr/07 07:21 AM


 Description  « Hide
BackingStoreHashtable uses a Vector and a Hashtable, but doesn't need the synchronization provided by these classes (I think). Replacing them with ArrayList and HashMap could improve performance for some kinds of operations.

 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Knut Anders Hatlen added a comment - 28/Mar/07 12:57 PM
Replacing the Hashtable with a HashMap was trivial, but because of differences between the implementations of Hashtable and HashMap, the rows come out in different order for some queries (in particular SELECT DISTINCT). And since the implementation of HashMap changed between JDK5 and JDK6, the order varies between JVMs as well. Three tests will be affected by this change:
  JUnit:
    LangScripts.aggregate
    ResultSetsFromPreparedStatementTest.testDistinctScanResultSet
  derbyall:
    lang/distinct.sql (currently being converted to JUnit, see DERBY-2491)

I don't think the LangScripts tests can handle different results in different JVMs, so that one will probably have to be converted to pure JUnit in order to work with this change. In ResultSetsFromPreparedStatementTest it should be possible to use an assert method which didn't require a specific order.

Knut Anders Hatlen added a comment - 28/Mar/07 01:06 PM
It seems like the ordering is already a problem in lang/distinct.sql. That's why it has different canons for different JVMs.

Andrew McIntyre added a comment - 28/Mar/07 04:55 PM
Thanks for the review, Knut! I appreciate it.

checkDistinctRows does only check the number of rows returned, and it is some cause for concern. Theoretically, I suppose a change could cause a row to be eliminated from one set of duplicates and then not removed from another for some reason, and the number of rows as checked by assertRowCount() would be the same, masking an actual failure. But because the data and queries here are so simple, I think it is more likely that duplicates would just fail to be eliminated, and such a change would cause some testcases to fail with higher row counts. So, while some detail has now been lost in the translation to JUnit, I think confidence in the test lies in its simple data sets and in the redundancy of the queries run over them.

As for additional checks in the new test, the only thing that immediately came to mind was to add a check into checkDistinctRows/assertRowCount that some key row in each query has returned only unique values. However, the row that is 'key' for the select * from ... queries changes depending on the data set, so it might be hard to nail down a generic solution that provides a good, simple assert method.

There are several tests that assert full result sets in the new test, but it might be worthwhile to add some tests that select distinct on one column, and then order by another column so that differences in row order are eliminated, and then assert the full result set in the test. If you have any ideas or time for that, feel free to add them to the new JUnit test once it is committed (which will be very shortly).

I'll correct the two items as regards checkDistinctRows(). After having some time away from the test, I'm thinking a better approach for RuntimeStatisticsParser is to let it do the parsing. :-) i.e. instead of:

assertEquals(-1, rtsp.indexOf("Distinct Scan"));
assertTrue(rtsp.indexOf("Eliminate duplicates = true") > 0);

in the test, have RuntimeStatisticsParser make the proper String calls and store the results as private instance variables whose value can be asserted using getter methods. The above would then become:

assertFalse(rtsp.usedDistinctScan());
assertTrue(rtsp.eliminatedDuplicates());

which I think would give the test better readability and make RuntimeStatisticsParser more useful.

Let me know if you have any other comments, otherwise I'll check the test in once I've made the changes mentioned above.

Knut Anders Hatlen added a comment - 29/Mar/07 08:29 AM
Attaching patch (derby-2493-assert) which creates a new assert method JDBC.assertUnorderedResultSet() which is similar to assertFullResultSet() only that it doesn't care about the order of the result set. Also made two test cases (one in DistinctTest and one in ResultSetsFromPreparedStatementTest) use the new method to prevent them from failing when the Hashtable in BackingStoreHashtable is replaced with a HashMap.

Knut Anders Hatlen added a comment - 29/Mar/07 08:34 AM
Committed the test changes with revision 523621.

Knut Anders Hatlen added a comment - 29/Mar/07 12:51 PM
The only part of lang/aggregate.sql that is sensitive to the Hashtable changes, is the repro for DERBY-280. Attaching a patch (derby-2493-aggregate) which moves that part into lang.GroupByTest. The new JUnit test case uses assertUnorderedResultSet to avoid the problem with different row order (except for one query which already specifies an explicit ORDER BY).

Knut Anders Hatlen added a comment - 29/Mar/07 12:59 PM
Committed derby-2493-aggregate with revision 523691.

Bryan Pendleton added a comment - 29/Mar/07 02:02 PM
I read (can't find the reference, sorry) that switching to use of unsynchronized
collection classes appears to provide substantially less benefit in JDK 1.5/1.6
than it did in, say, 1.3.

What is your experience? Does switching to the unsynchronized class provide
a substantial benefit in a 1.5 or 1.6 environment? Is there any easy comparison
of the magnitude of the difference between, say, 1.4 and 1.6 environments?

Knut Anders Hatlen added a comment - 29/Mar/07 02:40 PM
In my experience, things have become much better, especially in 1.6. Since the monitors in this issue are uncontended, I wouldn't expect any significant performance improvement on 1.6 (but I still think it's good to get rid of unnecessary synchronization). For contended monitors, the benefit would be greater even on 1.6. For instance, Olav's nightly performance tests (http://home.online.no/~olmsan/derby/perf/) show significantly improved performance for many of the multi-user tests between March 17 and March 18. That improvement was caused by Dyre's patch to DERBY-2114, which reduced some double synchronization by replacing a Hashtable with a HashMap in Clock. Similar effects were seen when replacing Hashtables with HashMaps in the lock manager.

I haven't done much testing with 1.4 lately, but there are some graphs attached to DERBY-2327 which show the difference between 1.5 and 1.6, comparing an approach using a HashMap inside a synchronized block with an approach using a ConcurrentHashMap. The difference between the two approaches is much greater on 1.5 than on 1.6, mainly because synchronization has become cheaper.

Knut Anders Hatlen added a comment - 10/Apr/07 03:11 PM
derby-2493-hashtable.diff replaces the Hashtable in BackingStoreHashtable with a HashMap. Derbyall and suites.All ran successfully (except the tests that also fail in the Tinderbox).

Knut Anders Hatlen added a comment - 11/Apr/07 07:21 AM
Committed derby-2493-hashtable.diff with revision 527402.

Knut Anders Hatlen added a comment - 12/Apr/07 12:19 PM
derby-2493-vector.diff replaces the Vectors with ArrayLists. Changes were also needed outside BackingStoreHashtable since some of the callers expected Vectors to be returned. I changed them so that they expected List objects instead of Vectors. All tests passed.

Knut Anders Hatlen added a comment - 13/Apr/07 07:21 AM
Committed derby-2493-vector.diff with revision 528374.